If you only need the text that sits between two <br /> tags, you can do something like the following:

    # Written for BeautifulSoup 3 on Python 2
    from BeautifulSoup import BeautifulSoup, NavigableString, Tag

    html = '''<br />important Text 1<br /><br />Not important Text<br />important Text 2<br />important Text 3<br /><br />Non important Text<br />important Text 4<br />'''

    soup = BeautifulSoup(html)

    for br in soup.findAll('br'):
        # The node right after this <br /> must be a text node
        next_s = br.nextSibling
        if not (next_s and isinstance(next_s, NavigableString)):
            continue
        # The node after that text must be another <br />
        next2_s = next_s.nextSibling
        if next2_s and isinstance(next2_s, Tag) and next2_s.name == 'br':
            text = str(next_s).strip()
            if text:  # skip whitespace-only text nodes
                print "Found:", next_s

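But perhaps I misunderstood your question? Your description of the problem does not seem to match the "important" / "not important" labels in your example data, so I have not gone any further ;)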
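
The snippet above uses the old `BeautifulSoup` 3 import and the Python 2 `print` statement. In case it helps, here is a rough sketch of the same sibling check for Python 3, assuming the `bs4` package (Beautiful Soup 4) is installed. The function name `text_between_brs` and the choice of the `'html.parser'` backend are my own, not something from the question.

```python
from bs4 import BeautifulSoup, NavigableString, Tag

html = '''<br />important Text 1<br /><br />Not important Text<br />important Text 2<br />important Text 3<br /><br />Non important Text<br />important Text 4<br />'''


def text_between_brs(markup):
    """Return the stripped text nodes that sit directly between two <br> tags.

    Hypothetical helper name; the parser choice is an assumption.
    """
    soup = BeautifulSoup(markup, 'html.parser')
    found = []
    for br in soup.find_all('br'):
        # The node right after this <br> must be a text node
        next_s = br.next_sibling
        if not isinstance(next_s, NavigableString):
            continue
        # The node after that text must be another <br>
        next2_s = next_s.next_sibling
        if isinstance(next2_s, Tag) and next2_s.name == 'br':
            text = str(next_s).strip()
            if text:  # skip whitespace-only text nodes
                found.append(text)
    return found


for text in text_between_brs(html):
    print("Found:", text)
```

On the sample input this reports all six text fragments, including the two labelled as not important, since each of them also sits between two <br /> tags.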