python - How does the regex "\" character and grouping "()" character work together?


I am trying to see which statements the following pattern matches:

\(*[0-9]{3}\)*-*[0-9]{3}\d\d\d+

I am a little confused because the grouping characters () have a \ before them. Does this mean the statement must have a ( and a )? Does it mean statements without ( or ) will be unmatched?

Statements: '404-678-2347' '(123)-1247890' '456-900-900' '(678)-2001236' '404123-1234' '(404123-123'

Context is important:

  • re.match(r'\(', content) matches a literal opening parenthesis.
  • re.match(r'\(*', content) matches 0 or more literal opening parentheses, making the parens optional (and allowing more than 1 of them, but that's a bug).

Since the intended behavior isn't "0 or more" but rather "0 or 1", this should be written r'\(?' instead.
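To make the difference between the three forms concrete, here is a minimal sketch. The helper name matched_prefix and the sample inputs are made up for illustration; they are not from the question.

```python
import re

def matched_prefix(pattern, text):
    """Return the text matched at the start of `text`, or None if no match."""
    m = re.match(pattern, text)
    return m.group() if m else None

# Hypothetical inputs chosen only to illustrate the three patterns.
for text in ["123", "(123", "((123"]:
    print(
        repr(text),
        repr(matched_prefix(r"\(", text)),   # exactly one literal "(" required
        repr(matched_prefix(r"\(*", text)),  # zero or more literal "("
        repr(matched_prefix(r"\(?", text)),  # zero or one literal "("
    )
```

Note that r'\(*' and r'\(?' always succeed, because matching an empty string counts as a match; only r'\(' can fail outright.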


That said, there's a whole lot about this regex that's silly. I'd consider instead:

[(]?\d{3}[)]?-?\d{6,} 
  • Using [(]? avoids backslashes, and is consequently easier to read whether it's rendered with str() or repr() (which escapes backslashes).
  • Mixing [0-9] and \d is silly; better to pick one and stick with it.
  • Using * in place of ? is silly, unless you really want to match (((123))-----4567890.
  • \d{3}\d\d\d+ matches 3 digits, then 3 or more additional digits. Why not just match 6 or more digits in the first place?
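As a rough check of the points above, the sketch below runs the original pattern and the suggested one side by side. The sample strings are hypothetical (they are not the statements from the question), and it uses fullmatch() so the whole string has to match, which is stricter than re.match().

```python
import re

ORIGINAL = re.compile(r"\(*[0-9]{3}\)*-*[0-9]{3}\d\d\d+")
SUGGESTED = re.compile(r"[(]?\d{3}[)]?-?\d{6,}")

# Hypothetical inputs, made up for this comparison.
samples = [
    "(123)-4567890",         # one pair of parens, one hyphen
    "1234567890",            # no parens, no hyphen
    "(((123))-----4567890",  # repeated parens and hyphens
    "12345",                 # too few digits
]

for text in samples:
    original_ok = ORIGINAL.fullmatch(text) is not None
    suggested_ok = SUGGESTED.fullmatch(text) is not None
    print(f"{text!r:24} original={original_ok} suggested={suggested_ok}")

# The repr() point: a backslash escape is doubled, a character class is not.
print(repr(r"\(?"))
print(repr("[(]?"))
```

The two patterns should agree on the first, second, and fourth inputs; only the original accepts the string with repeated parens and hyphens.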
