Stripping inline tags with Python's lxml

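I think strip_tags and strip_elements are what you want in the two cases: strip_tags removes the matching tags but keeps their contents, while strip_elements removes the matching elements together with their contents. For example, this script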

    from lxml import etree

    text = "<x>hello, <z>keep me</z> and <y>ignore me</y>, and here's some <y>more</y> text</x>"

    tree = etree.fromstring(text)
    # encoding="unicode" makes tostring() return str rather than bytes (Python 3)
    print(etree.tostring(tree, pretty_print=True, encoding="unicode"))

    # Remove the <z> tags, but keep their contents:
    etree.strip_tags(tree, 'z')
    print('-' * 72)
    print(etree.tostring(tree, pretty_print=True, encoding="unicode"))

    # Remove all the <y> tags including their contents,
    # but keep the text that follows each one (with_tail=False):
    etree.strip_elements(tree, 'y', with_tail=False)
    print('-' * 72)
    print(etree.tostring(tree, pretty_print=True, encoding="unicode"))

…produces the following output:

    <x>hello, <z>keep me</z> and <y>ignore me</y>, and here's some <y>more</y> text</x>

    ------------------------------------------------------------------------
    <x>hello, keep me and <y>ignore me</y>, and here's some <y>more</y> text</x>

    ------------------------------------------------------------------------
    <x>hello, keep me and , and here's some  text</x>
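
The with_tail=False argument in the script above matters more than it may look. By default, strip_elements also removes the tail text that follows each stripped element. The sketch below compares the two settings on a shortened version of the same input; the helper name strip_y and the constant SAMPLE are made up for this illustration and are not part of lxml.

```python
from lxml import etree

# Shortened version of the input used above (hypothetical constant name).
SAMPLE = "<x>hello, <y>ignore me</y>, and here's some <y>more</y> text</x>"


def strip_y(with_tail):
    """Parse SAMPLE, drop every <y> element, and return the serialized tree."""
    tree = etree.fromstring(SAMPLE)
    etree.strip_elements(tree, "y", with_tail=with_tail)
    return etree.tostring(tree, encoding="unicode")


# Tail text after each <y> is kept:
kept_tail = strip_y(with_tail=False)
print(kept_tail)     # <x>hello, , and here's some  text</x>

# Tail text after each <y> is removed along with the element (the default):
dropped_tail = strip_y(with_tail=True)
print(dropped_tail)  # <x>hello, </x>
```

When stripping inline markup out of running prose, with_tail=False is usually the safer choice, since the text after an inline element normally belongs to the surrounding sentence.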


Sharing is welcome; when reposting, please credit the source: 内存溢出

Original URL: https://outofmemory.cn/zaji/5654958.html
