python - Determine if two HTML elements are siblings
So, I'm building a little utility to automatically grab the text from an article-style page. One thought I had on how best to solve this problem is to find all elements that have more than ~150 chars of text:
document.xpath("//*[string-length( text() ) > 150 ]")
Now I have a list of elements and I want to identify which of those elements are siblings. If possible, I'd like to avoid doing any more DOM traversal, for the sake of efficiency.
Is there a nice way of doing this in lxml?
Given a list of nodes l, you can check whether the parent of each pair of elements is the same (where the parent is obtained with .getparent()):
    def get_siblings(l):
        for a in l:
            for b in l:
                # id() compares object identities (memory addresses in CPython),
                # so we don't duplicate pairs or test elements against themselves
                if id(a) < id(b):
                    if a.getparent() == b.getparent():
                        yield (a, b)
or maybe simpler:
    def get_siblings(l):
        return ((a, b) for a in l for b in l
                if id(a) < id(b) and a.getparent() == b.getparent())
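The pairwise check can also be written with itertools.combinations, which produces each unordered pair exactly once, so no ordering test on the elements is needed. This is only a sketch: the name sibling_pairs is made up for illustration, and it assumes each node has a .getparent() method, as lxml elements do.

```python
from itertools import combinations


def sibling_pairs(nodes):
    """Yield each unordered pair of nodes that share the same parent.

    `sibling_pairs` is a hypothetical helper name; `nodes` is assumed to be
    a list of objects with a .getparent() method (e.g. lxml elements).
    """
    for a, b in combinations(nodes, 2):
        parent = a.getparent()
        # A node without a parent (the root) has no siblings
        if parent is not None and parent == b.getparent():
            yield (a, b)
```

Like the nested loops, this still inspects every pair, so the work grows quadratically with the length of the list.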
You could also use a Counter to find parents with more than one child in the list, and then find the elements with those parents:
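    from collections import Counter

    def get_siblings(l):
        # Count how many of the listed elements share each parent
        c = Counter([x.getparent() for x in l])
        # Keep only elements whose parent occurs more than once
        return [x for x in l if c[x.getparent()] > 1]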
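If you need to know which elements are siblings of each other, rather than just which elements have a sibling somewhere in the list, the same single pass can collect the elements into groups keyed by parent. A minimal sketch, assuming only that each node has a .getparent() method; group_siblings is a hypothetical name, not part of lxml.

```python
from collections import defaultdict


def group_siblings(nodes):
    """Return lists of nodes that share a parent, skipping lone nodes.

    `group_siblings` is a hypothetical helper name; `nodes` is assumed to be
    a list of objects with a .getparent() method (e.g. lxml elements).
    """
    by_parent = defaultdict(list)
    for node in nodes:
        # One .getparent() call per node; no further tree traversal
        by_parent[node.getparent()].append(node)
    # Only groups with at least two members contain actual siblings
    return [group for group in by_parent.values() if len(group) > 1]
```

With the XPath query from the question, this would be called as group_siblings(document.xpath("//*[string-length( text() ) > 150 ]")). It makes one pass over the list instead of comparing every pair.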