regex - Extracting phone number issue in R
I have numbers like this:
ll <- readLines(textConnection("(412) 573-7777 opt 1
563.785.1655 x1797
(567) 523-1534 x7753
(567) 483-2119 x 477
(451) 897-mall
(342) 668-6255 ext 7
(317) 737-3377 opt 4
(239) 572-8878 x 3
233.785.1655 x1776
(138) 761-6877 x 4
(411) 446-6626 x 14
(412) 337-3332x19
412.393.3177 x24
327.961.1757 ext.4"))
What regex should I write to get:
xxx-xxx-xxxx
I tried this one:
gsub('[(]([0-9]{3})[)] ([0-9]{3})[-]([0-9]{4}).*', '\\1-\\2-\\3', ll)
It doesn't cover all the possibilities, because it only handles the "(xxx) xxx-xxxx" form and not the dot-separated one. I think it can be done using several regex patterns, but I think it can also be done using a single regex.
If you also want to extract numbers that are represented with letters, you can use the following regex in gsub:
gsub('[(]?([0-9]{3})[)]?[. -]([a-z0-9]{3})[. -]([a-z0-9]{4}).*', '\\1-\\2-\\3', ll)
See the ideone demo.
You can remove a-z from the character classes to match only numbers that contain no letters.
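As a rough illustration of the digits-only variant, here is a minimal sketch. The shortened sample vector and the names `digits_only` and `res` are assumptions made for this example, not part of the original question. Note that gsub leaves an element unchanged when the pattern does not match it, so the entry containing letters passes through as-is.

```r
# Shortened sample of the input (assumed for illustration)
ll <- c("(412) 573-7777 opt 1",
        "563.785.1655 x1797",
        "(451) 897-mall",
        "(412) 337-3332x19")

# Same pattern as above, with a-z removed from the character classes
digits_only <- "[(]?([0-9]{3})[)]?[. -]([0-9]{3})[. -]([0-9]{4}).*"

# Replace each matching string with its three captured groups
res <- gsub(digits_only, "\\1-\\2-\\3", ll)
print(res)
# "412-573-7777" "563-785-1655" "(451) 897-mall" "412-337-3332"
```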
Regex explanation:
[(]? - an optional (
([0-9]{3}) - 3 digits
[)]? - an optional )
[. -] - either a dot, a space, or a hyphen
([a-z0-9]{3}) - a sequence of 3 digits or letters
[. -] - either a dot, a space, or a hyphen
([a-z0-9]{4}) - a sequence of 4 digits or letters
.* - any number of characters up to the end
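The breakdown above can be mirrored in code by assembling the pattern piece by piece. This is only a sketch: the names `pattern`, `matched`, and `normalized`, the shortened sample vector, and the extra non-matching entry are assumptions added for illustration. Because gsub returns unmatched strings unchanged, the sketch uses grepl to flag them and stores NA instead.

```r
# Build the pattern from the pieces described above
pattern <- paste0(
  "[(]?",           # optional opening parenthesis
  "([0-9]{3})",     # group 1: 3 digits
  "[)]?",           # optional closing parenthesis
  "[. -]",          # dot, space, or hyphen
  "([a-z0-9]{3})",  # group 2: 3 digits or letters
  "[. -]",          # dot, space, or hyphen
  "([a-z0-9]{4})",  # group 3: 4 digits or letters
  ".*"              # everything up to the end of the string
)

# Shortened sample plus one entry without a phone number (assumed)
ll <- c("(567) 483-2119 x 477",
        "(451) 897-mall",
        "327.961.1757 ext.4",
        "no phone here")

# Flag which entries match, then normalize only those
matched <- grepl(pattern, ll)
normalized <- ifelse(matched, gsub(pattern, "\\1-\\2-\\3", ll), NA_character_)
print(normalized)
# "567-483-2119" "451-897-mall" "327-961-1757" NA
```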