python - Determine if two HTML elements are siblings


So, I'm building a little utility to automatically grab the text from an article-style page. My thought on how best to solve the problem is to find all elements with more than ~150 chars of text:

document.xpath("//*[string-length( text() ) > 150 ]") 

I get a list of elements and want to identify which of those elements are siblings. If possible, I'd like to avoid doing any more DOM traversal, for the sake of efficiency.

Is there a nice way of doing this in lxml?

Given a list of nodes l, you can check whether the parent of each pair of elements is the same (where the parent is obtained with .getparent()):

def get_siblings(l):
    for a in l:
        for b in l:
            # Compare the elements' ids (memory addresses in CPython) so we
            # don't duplicate pairs or test elements against themselves.
            if id(a) < id(b):
                if a.getparent() == b.getparent():
                    yield (a, b)

Or, maybe simpler:

def get_siblings(l):
    return ((a, b) for a in l
                   for b in l
                   if id(a) < id(b)
                   and a.getparent() == b.getparent())
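The same pairwise check can probably be written without comparing addresses at all, since itertools.combinations already yields each unordered pair once and never pairs an element with itself. This is only a sketch: sibling_pairs is a hypothetical name, and it assumes every item exposes lxml's .getparent(). It also skips elements with no parent, so two root elements are never reported as siblings.

```python
from itertools import combinations


def sibling_pairs(elements):
    """Yield (a, b) for every pair of listed elements that share a parent."""
    for a, b in combinations(elements, 2):
        parent = a.getparent()
        # A root element has no parent (None); don't treat two roots as siblings.
        if parent is not None and parent is b.getparent():
            yield (a, b)
```

Like the loops above, this still checks every pair, so the work grows quadratically with the length of the list.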

You could also use a Counter to find the parents with more than one child in the list, and then find the elements that have those parents:

from collections import Counter

def get_siblings(l):
    # Count how many of the listed elements share each parent.
    c = Counter([x.getparent() for x in l])
    return [x for x in l if c[x.getparent()] > 1]
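If you want to know which elements are siblings of each other, rather than just which ones have a sibling somewhere in the list, one option is to group them by parent in a single pass. A minimal sketch, assuming the items are lxml elements (only .getparent() is used); group_siblings is a hypothetical name, not part of lxml:

```python
from collections import defaultdict


def group_siblings(elements):
    """Return lists of elements that share a parent, one list per parent."""
    groups = defaultdict(list)
    for el in elements:
        parent = el.getparent()
        # Skip root elements, which have no parent to share.
        if parent is not None:
            groups[parent].append(el)
    # Keep only the parents that contributed at least two of the listed elements.
    return [siblings for siblings in groups.values() if len(siblings) > 1]
```

This calls .getparent() once per element and does no further traversal, so it should scale linearly with the length of the list returned by the XPath query.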
