regex - Regular Expression to match most explicit string


I have some experience with regular expressions but am far from expert level, and I need a way to match the record with the most explicit string in a file where each record begins with a unique 1-5 digit integer, padded with various other characters when it is shorter than 5 digits. For example, my file has records that begin with:

32000 3201x 32014 320xy 

In this example, the non-numeric characters represent wildcards. I thought the following regex examples would work, but rather than matching the record with the most explicit number, they always match the record with the least explicit number. Remember, I do not know what is in the file, so I need to test all possibilities to locate the most explicit match.

  • If I need to search for 32000, the regex looks like /^3\d{4}|^32\d{3}|^320\d{2}|^3200\d|^32000/. It should match 32000, but it matches 320xy.
  • If I need to search for 32014, the regex looks like /^3\d{4}|^32\d{3}|^320\d{2}|^3201\d|^32014/. It should match 32014, but it matches 320xy.
  • If I need to search for 32015, the regex looks like /^3\d{4}|^32\d{3}|^320\d{2}|^3201\d|^32015/. It should match 3201x, but it matches 320xy.

In each case, the matched result is the least specific numeric value. I tried reversing the regex as follows, but still got the same results: /^32014|^3201\d|^320\d{2}|^32\d{3}|^3\d{4}/

Any help is appreciated.

OK, I did quite a bit of tinkering here. I am 99% sure this is pretty much impossible (if you don't cheat and interpolate code into the regex). The reason is that you would need a negative lookbehind of variable length at some point.

However, I came up with 2 alternatives: one if you only want to find the "most exact match", and a second one if you want to replace it with something. Here we go:

/(32000)|\A(?!.*32000).*(3200\d)|\A(?!.*3200[0\d]).*(320\d\d)|\A(?!.*320[0\d][0\d]).*(32\d\d\d)|\A(?!.*32[0\d][0\d][0\d]).*(3\d\d\d\d)/m

Question:

So what is the "most exact match" here?

Answer:

The concatenation of the 5 matched groups - \1\2\3\4\5. In fact only 1 of them will match; the other 4 will be empty.
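As a rough illustration of the group concatenation described above, here is a minimal Python sketch. It assumes Python's `re` module rather than Ruby, uses `\A` as the start-of-string anchor, and passes `re.DOTALL` so that `.` can cross line boundaries. The function name `most_exact_match` and the sample records are made up for the example.

```python
import re

# The first regex from the answer, split per alternative for readability.
# re.DOTALL lets "." match newlines so ".*" can scan the whole text.
MOST_EXACT = re.compile(
    r"(32000)"
    r"|\A(?!.*32000).*(3200\d)"
    r"|\A(?!.*3200[0\d]).*(320\d\d)"
    r"|\A(?!.*320[0\d][0\d]).*(32\d\d\d)"
    r"|\A(?!.*32[0\d][0\d][0\d]).*(3\d\d\d\d)",
    re.DOTALL,
)

def most_exact_match(text):
    """Return the concatenation of the 5 groups, or None if nothing matches."""
    m = MOST_EXACT.search(text)
    if m is None:
        return None
    # Only one group takes part in the match; the others are None.
    return "".join(g or "" for g in m.groups())

# Hypothetical sample data: no exact 32000, so the closest form 3200\d wins.
sample = "32777 a\n32055 b\n32004 c\n"
print(most_exact_match(sample))  # 32004
```

Because `.*` is greedy, when several records fit the same alternative the last one in the text is the one captured.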

/(32000)|\A(?!.*32000)(.*)(3200\d)|\A(?!.*3200[0\d])(.*)(320\d\d)|\A(?!.*320[0\d][0\d])(.*)(32\d\d\d)|\A(?!.*32[0\d][0\d][0\d])(.*)(3\d\d\d\d)/m

Question:

How can I use this to replace the "most exact match"?

Answer:

In this case the "most exact match" is the concatenation of \1\3\5\7\9, but we have also matched some other things before it, namely \2\4\6\8 (again, only 1 of these can be non-empty). Therefore, if you want to replace the "most exact match" with fubar, you can match with the above regex and replace it with \2\4\6\8fubar.
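The replacement step can be sketched in Python along the same lines. This is only an illustration under a few assumptions: Python's `re` module stands in for Ruby, `re.DOTALL` lets `.` match newlines, the name `replace_most_exact` is hypothetical, and `count=1` relies on the records being unique as stated in the question. A replacement function is used instead of the template `\2\4\6\8fubar` so that groups which did not take part in the match are explicitly treated as empty.

```python
import re

# The second regex from the answer: each ".*" prefix is captured (groups
# 2, 4, 6, 8) so it can be written back unchanged in front of the replacement.
REPLACE_RE = re.compile(
    r"(32000)"
    r"|\A(?!.*32000)(.*)(3200\d)"
    r"|\A(?!.*3200[0\d])(.*)(320\d\d)"
    r"|\A(?!.*320[0\d][0\d])(.*)(32\d\d\d)"
    r"|\A(?!.*32[0\d][0\d][0\d])(.*)(3\d\d\d\d)",
    re.DOTALL,
)

def replace_most_exact(text, replacement="fubar"):
    """Replace the "most exact match" and keep everything matched before it."""
    def rebuild(m):
        # Equivalent of \2\4\6\8 followed by the replacement text.
        prefix = "".join(m.group(i) or "" for i in (2, 4, 6, 8))
        return prefix + replacement
    return REPLACE_RE.sub(rebuild, text, count=1)

# Hypothetical sample data: 32004 is the closest record to 32000.
print(replace_most_exact("32777 a\n32055 b\n32004 c\n"))
# 32777 a
# 32055 b
# fubar c
```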

Another way you can think about it (and it might be helpful) is that the "most exact match" is the last matched line of either of the 2 regexes.

Two things to note here:

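  • I used Ruby-style regexes. \A (with a capital A) means beginning of string (not beginning of line, which is ^), and the trailing /m means multi-line mode (in Ruby, this lets . match newlines). You should be able to find the syntax for the same things in your language/technology as long as it uses some flavor of PCRE.
  • This can be slow. If you don't find the exact match, you might possibly have to match and replace the entire string (if the non-exact match can be found at the end of the string).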
