I think you can do the following:
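# For each <h2>, walk its following siblings and print every <p> until something else appears.
for section in soup.findAll('h2'):
    nextNode = section
    while True:
        nextNode = nextNode.nextSibling
        try:
            tag_name = nextNode.name
        except AttributeError:
            # No usable tag name here (for example, the end of the document)
            tag_name = ""
        if tag_name == "p":
            print(nextNode.string)
        else:
            # Anything other than a <p> ends the current section
            print("*****")
            break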
Given:
<h2>section1</h2><p>article1</p><p>article2</p><p>article3</p><h2>section2</h2><p>article4</p><p>article5</p><p>article6</p>
Output:
article1
article2
article3
*****
article4
article5
article6
*****
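Note that the loop stops at the first sibling that is not a <p>, so if the HTML has line breaks or spaces between the tags, the whitespace text node right after the <h2> would end the section immediately. Here is a minimal sketch of a whitespace-tolerant variant, assuming the bs4 package (Beautiful Soup 4) and its built-in "html.parser" backend. The function name paragraphs_by_section and the returned dict shape are my own choices for illustration, not part of the original answer:

```python
from bs4 import BeautifulSoup
from bs4.element import Tag


def paragraphs_by_section(html):
    """Map each <h2> heading's text to the texts of the <p> tags that directly follow it.

    Hypothetical helper for illustration; the name and return shape are assumptions.
    """
    soup = BeautifulSoup(html, "html.parser")
    sections = {}
    for heading in soup.find_all("h2"):
        paragraphs = []
        for node in heading.next_siblings:
            if isinstance(node, Tag):
                if node.name != "p":
                    # Any other tag (e.g. the next <h2>) ends this section
                    break
                paragraphs.append(node.get_text())
            elif str(node).strip():
                # Non-empty text between tags also ends the section
                break
            # Whitespace-only text nodes are skipped
        sections[heading.get_text()] = paragraphs
    return sections


html = (
    "<h2>section1</h2><p>article1</p><p>article2</p><p>article3</p>"
    "<h2>section2</h2><p>article4</p><p>article5</p><p>article6</p>"
)

if __name__ == "__main__":
    for title, articles in paragraphs_by_section(html).items():
        print(title, articles)
```

Returning a dict instead of printing makes the grouping easier to reuse, and iterating over next_siblings avoids the manual while loop and the try/except around a missing sibling.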