bash - How do I count multiple overlapping strings and get the total occurrences per line (awk or anything else)


I have an input file like this:

    315secondbin    x12121321211332123x
    315firstbin 3212212121x
    315thirdbin 132221312
    316firstbin 121
    316secondbin    1212

What I want is to count how many instances of a few different strings (say "121" and "212") exist in each line, counting overlaps. The expected output would be:

    6
    5
    0
    1
    2

So I modified an awk script from another thread to use the or operator, in the hope that it would count everything that meets either condition:

    {
        count = 0
        $0 = tolower($0)
        while (length() > 0) {
            m = match($0, /212/ || /121/)
            if (m == 0)
                break
            count++
            $0 = substr($0, m + 1)
        }
        print count
    }

Unfortunately, the output is this:

    8
    4
    0
    2
    3

But if I leave out the or, it counts perfectly. What am I doing wrong?

Also, I run the script on the file ymaz.txt by running:

 cat ymaz.txt | awk -v "pattern=" -f count3.awk 

As an alternate approach, I tried this:

    {
        count = 0
        $0 = tolower($0)
        while (length() > 0) {
            m = match($0, /212/)
            y = match($0, /121/)
            if ((m == 0) && (y == 0))
                break
            count++
            $0 = substr($0, (m + 1) + (y + 1))
        }
        print count
    }

But the output is this:

    1
    1
    0
    1
    1

What am I doing wrong? I know I should be understanding the code and not just cutting and pasting stuff together, but that's my skill level at this point.

BTW, when I don't have the or in there (i.e. I'm searching for just one string), it works perfectly.

You're making it too complicated:

    {
        count=0
        while ( match($0,/121|212/) ) {
            count++
            $0=substr($0,RSTART+1)
        }
        print count
    }

    $ awk -f tst.awk file
    6
    5
    0
    1
    2
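
The same loop can also take its patterns from the command line, roughly what the `-v "pattern="` in the question seems to be aiming for. The following is a minimal sketch, not part of the original script: the shell function name `count_overlaps` and the awk variables `pattern` and `s` are made up for illustration, and it works on a copy of the line rather than overwriting `$0`.

```shell
# count_overlaps PATTERN [FILE...]
# Hypothetical helper: prints, for each input line, how many (possibly
# overlapping) matches of the extended regexp PATTERN the line contains.
count_overlaps() {
    pattern=$1
    shift
    awk -v pattern="$pattern" '
    {
        s = $0          # work on a copy so $0 and the fields stay intact
        count = 0
        # Stop at end of string so that a pattern able to match the
        # empty string cannot loop forever.
        while (s != "" && match(s, pattern)) {
            count++
            # Restart one character past the START of the match (not its
            # end), which is what lets overlapping matches be counted.
            s = substr(s, RSTART + 1)
        }
        print count
    }' "$@"
}

# Usage with the alternation from the answer above:
printf '%s\n' '315secondbin    x12121321211332123x' '315firstbin 3212212121x' |
    count_overlaps '121|212'
```

Because the pattern is passed as a string, it is used as a dynamic regexp, so `|` still means "or" inside it.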

Your fundamental problem is that you are confusing a condition with a regexp. A regexp can be compared to a string to form a condition, and when the string in question is $0 you can leave it out and just use regexp as shorthand for $0 ~ regexp, but in that context what's being tested is still a condition. The 2nd arg to match() is a regexp, not a condition. | is the or operator in a regexp while || is the or operator in a condition. /.../ are the regexp delimiters.

/foo/ is a regexp

$0 ~ /foo/ is a condition

/foo/ in a conditional context is shorthand for $0 ~ /foo/ but in any other context it is just a regexp.

/foo/ || /bar/ in a conditional context is shorthand for $0 ~ /foo/ || $0 ~ /bar/, so as the 2nd arg to match() awk assumes you intended to write:

match($0,($0 ~ /foo/ || $0 ~ /bar/)) 

i.e. test the current record against foo or bar, and if that is true then the condition evaluates to 1 and that 1 is given to match() as its 2nd arg.

Look:

    $ echo foo | gawk 'match($0,/foo/||/bar/)'
    $ echo foo | gawk '{print /foo/||/bar/}'
    1
    $ echo 1foo | gawk 'match($0,/foo/||/bar/)'
    1foo
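
The same effect explains the numbers in the question. Here is a small sketch that reruns the question's loop on single lines of its input. The shell function name `count_with_condition` is made up for illustration, and the tolower() call is dropped because it does not affect digits.

```shell
# Hypothetical wrapper around the loop from the question.
# /212/ || /121/ is evaluated against $0 first; its result (1 or 0) is
# then what match() searches for, so the loop really counts "1"
# characters for as long as the remaining text contains 212 or 121.
count_with_condition() {
    awk '{
        count = 0
        while (length() > 0) {
            m = match($0, /212/ || /121/)
            if (m == 0)
                break
            count++
            $0 = substr($0, m + 1)
        }
        print count
    }'
}

# First input line: prints 8, as in the question, not the expected 6.
echo '315secondbin    x12121321211332123x' | count_with_condition
```

Each pass consumes text up to and including the next "1", starting with the one in "315", which is why the totals drift away from the real number of matches.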

Get the book Effective Awk Programming, 4th Edition, by Arnold Robbins.

