If you are already comfortable with BeautifulSoup, all you need to do is add soupselect to your toolkit.
soupselect is a CSS selector extension for BeautifulSoup.
Usage:
>>> from BeautifulSoup import BeautifulSoup as Soup
>>> from soupselect import select
>>> import urllib
>>> soup = Soup(urllib.urlopen('http://slashdot.org/'))
>>> select(soup, 'div.title h3')
[<h3><span><a href='//science.slashdot.org/'>Science</a>:</span></h3>, <h3><a href='//slashdot.org/articles/07/02/28/0120220.shtml'>Star Trek</h3>,..]
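The session above uses the Python 2 era imports (`from BeautifulSoup import ...` and `urllib.urlopen`). For comparison, here is a minimal sketch of the same `div.title h3` query on Python 3, assuming the `bs4` package (Beautiful Soup 4) is installed. It uses that package's built-in `select()` method rather than soupselect, and the inline HTML and `example` URLs are made up for illustration so the snippet runs without network access.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML standing in for a fetched page (not real Slashdot markup).
html = """
<div class="title"><h3><a href="//science.example/">Science</a></h3></div>
<div class="title"><h3><a href="//news.example/">Star Trek</a></h3></div>
<div class="other"><h3>Ignored</h3></div>
"""

# Parse with the parser bundled in the standard library.
soup = BeautifulSoup(html, "html.parser")

# Same CSS selector as in the session above: every <h3> inside a <div class="title">.
headings = soup.select("div.title h3")

# Extract just the visible text of each matched heading.
titles = [h3.get_text(strip=True) for h3 in headings]
print(titles)
```

Only the two headings inside `div.title` should match; the `<h3>` under `div.other` is skipped because the selector requires the `title` class on an enclosing `div`.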