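Assuming you are already on the page you want to parse, Selenium stores the source HTML in the driver's page_source attribute. You can then load page_source into BeautifulSoup as follows: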
In [8]: from bs4 import BeautifulSoup
In [9]: from selenium import webdriver
In [10]: driver = webdriver.Firefox()
In [11]: driver.get('http://news.ycombinator.com')
In [12]: html = driver.page_source
In [13]: soup = BeautifulSoup(html, 'html.parser')
In [14]: for tag in soup.find_all('title'):
   ....:     print(tag.text)
   ....:
Hacker News
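The same steps can be wrapped into two small functions, which keeps the parsing testable without a browser. This is only a sketch: the function names extract_titles and fetch_page_source are made up for illustration, and it assumes Firefox and its driver are available to Selenium.

```python
from bs4 import BeautifulSoup


def extract_titles(html):
    """Return the text of every <title> tag found in an HTML string."""
    # 'html.parser' is Python's built-in parser, so no extra install is needed.
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.find_all("title")]


def fetch_page_source(url):
    """Open url in Firefox via Selenium and return the page's HTML.

    Hypothetical helper: assumes Firefox and a matching driver are installed.
    """
    # Imported here so extract_titles can be used without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Close the browser even if loading the page raises an error.
        driver.quit()
```

Calling extract_titles(fetch_page_source('http://news.ycombinator.com')) would reproduce the session above, while extract_titles alone can be run on any HTML string you already have.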