Try:
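from itertools import chain
from lxml.etree import tostring

def stringify_children(node):
    """Return node's text and serialized children as one string."""
    parts = (
        [node.text]
        # Serialize each child without its tail, then add the tail once.
        # (The child's own text is already part of its serialization.)
        + list(chain(*([tostring(c, encoding="unicode", with_tail=False), c.tail]
                       for c in node)))
        # Text that follows node's own closing tag, if any.
        + [node.tail]
    )
    # filter removes possible Nones in texts and tails
    return ''.join(filter(None, parts))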
Example:
from lxml import etree

node = etree.fromstring("""<content>
Text outside tag <div>Text <em>inside</em> tag</div>
</content>""")
stringify_children(node)
This produces:
'\nText outside tag <div>Text <em>inside</em> tag</div>\n'