Overview
Hmm, changing the regexp totally seems dangerous.
What about
r"^GNU ld [^$]* (\d+\.\d+)[a-z0-9.-]*$"
That would keep the string and only strip any additional version stuff. The change is minimal (the added space before the number and the extension after it).
Better still: r"^GNU ld[^$]* (\d+\.\d+)[a-z0-9.-]*$"
Don't add a space, only move it from front to back.
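To see how the two variants differ, a minimal Python sketch can run both against a few version strings. The sample strings below are illustrative assumptions, not output collected from real distros, and the helper name `extract` is made up for this example.

```python
import re

# The two patterns proposed above.
WITH_LEADING_SPACE = re.compile(r"^GNU ld [^$]* (\d+\.\d+)[a-z0-9.-]*$")
SPACE_MOVED_BACK = re.compile(r"^GNU ld[^$]* (\d+\.\d+)[a-z0-9.-]*$")

# Hypothetical version strings, invented for illustration only.
SAMPLES = [
    "GNU ld 2.36",
    "GNU ld (GNU Binutils) 2.36.1",
    "GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.37.20211103-1",
]


def extract(pattern, text):
    """Return the captured major.minor version, or None if nothing matches."""
    match = pattern.match(text)
    return match.group(1) if match else None


for sample in SAMPLES:
    print(sample, extract(WITH_LEADING_SPACE, sample), extract(SPACE_MOVED_BACK, sample))
```

With these samples, the first pattern does not match the plain "GNU ld 2.36" string because it needs two separate spaces before the number, while the second pattern matches all three.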
Thank you for looking into this and your proposal.
It is closer to the original and seems to do its job fine in the mentioned openSUSE distros.
On the other hand, I would like to keep the changed regular expression if possible, as a reference to the proposal in the upstream issue, until upstream indicates a preference for another solution (e.g. yours), or unless my version is really "dangerous" or "wrong" for this package. Is it?
My idea was to completely ignore the "additional version stuff" in a way that does not break detection if it suddenly starts to include unexpected content, for example characters outside your added character set.
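The concern about characters outside the added set can be checked with a short Python sketch. It uses the second pattern proposed above; the version strings and the helper name `detect` are invented for illustration.

```python
import re

# Second pattern proposed earlier in this thread.
PATTERN = re.compile(r"^GNU ld[^$]* (\d+\.\d+)[a-z0-9.-]*$")


def detect(text):
    """Return the captured major.minor version, or None if detection fails."""
    match = PATTERN.match(text)
    return match.group(1) if match else None


# Suffix uses only characters from [a-z0-9.-]: detection works.
expected_suffix = detect("GNU ld (GNU Binutils) 2.36.1-rc1")

# Suffix contains "_" and an uppercase letter: the pattern no longer matches.
unexpected_suffix = detect("GNU ld (GNU Binutils) 2.36.1_X")

print(expected_suffix, unexpected_suffix)
```

In this sketch the second call returns `None`, which is the kind of breakage the paragraph above is worried about.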
What do you think?
Can you agree?
It depends on what other distros have in the text field. If it's always (GNU ...), your version is fine and probably better than the other one. If it's more open, then your version will break ;-)
P.S. For future regexps: \(GNU Binutils.*\) is better written as \(GNU Binutils[^)]*\). This way the regexp cannot break out of the brackets and thus usually causes a failure instead of unexpected behaviour in case the strings don't follow expectations.
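A small Python sketch of the difference between the two forms, assuming a made-up input string that contains a second parenthesised group:

```python
import re

GREEDY_DOT = re.compile(r"\(GNU Binutils.*\)")
NEGATED_CLASS = re.compile(r"\(GNU Binutils[^)]*\)")

# Hypothetical string, invented for illustration only.
text = "GNU ld (GNU Binutils; openSUSE) 2.37 (extra)"

# ".*" is greedy and runs on to the last ")" in the string.
greedy = GREEDY_DOT.search(text).group(0)

# "[^)]*" cannot cross a ")" and so stops at the first closing bracket.
bounded = NEGATED_CLASS.search(text).group(0)

print(greedy)   # (GNU Binutils; openSUSE) 2.37 (extra)
print(bounded)  # (GNU Binutils; openSUSE)
```

The greedy form swallows the version number and the second group, whereas the negated character class stays inside the first pair of brackets.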
Thanks for the proposed regex tightening. I'll create a new submit request with it included.
Regarding the dangers of the extended assumptions about the string to parse, I agree with you.
My hope is that users of other distros will post the output of their version string in the upstream issue before a decision on how to actually change the regex is accepted.