I was able to access the site's content by using a proxy found here:
https://free-proxy-list.net/
Then, by building the request with the requests module, you can scrape the site:
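import re

import requests
from bs4 import BeautifulSoup as soup

# requests selects a proxy by URL scheme, so an https:// URL needs an 'https' entry.
# The address comes from the proxy list above; substitute one that is currently live.
proxies = {'http': 'http://50.207.31.221:80', 'https': 'http://50.207.31.221:80'}
r = requests.get('https://seekingalpha.com/symbol/AMAT/earnings', proxies=proxies).text

# Escape the dollar sign so it matches literally instead of acting as an end-of-string anchor.
results = re.findall(r'Revenue of \$[a-zA-Z0-9.]+', r)
s = soup(r, 'lxml')
titles = [x.text for x in s.find_all('span', {'class': 'title-period'})]
epas = [x.text for x in s.find_all('span', {'class': 'eps'})]
# Beat/miss markers; collected here but not used in the rows below.
deciding = [x.text for x in s.find_all('span', {'class': re.compile('green|red')})]
# One row per quarter: [period, EPS, revenue, EPS]
results = list(map(list, zip(titles, epas, results, epas)))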
Output:
[[u'Q4: 11-16-17', u'EPS of .93 beat by .02', u'Revenue of .97B', u'EPS of .93 beat by .02'],
 [u'Q3: 08-17-17', u'EPS of .86 beat by .02', u'Revenue of .74B', u'EPS of .86 beat by .02'],
 [u'Q2: 05-18-17', u'EPS of .79 beat by .03', u'Revenue of .55B', u'EPS of .79 beat by .03'],
 [u'Q1: 02-15-17', u'EPS of .67 beat by .01', u'Revenue of .28B', u'EPS of .67 beat by .01'],
 [u'Q4: 11-17-16', u'EPS of .66 beat by .01', u'Revenue of .30B', u'EPS of .66 beat by .01'],
 [u'Q3: 08-18-16', u'EPS of .50 beat by .02', u'Revenue of .82B', u'EPS of .50 beat by .02'],
 [u'Q2: 05-19-16', u'EPS of .34 beat by .02', u'Revenue of .45B', u'EPS of .34 beat by .02'],
 [u'Q1: 02-18-16', u'EPS of .26 beat by .01', u'Revenue of .26B', u'EPS of .26 beat by .01'],
 [u'Q4: 11-12-15', u'EPS of .29 in-line ', u'Revenue of .37B', u'EPS of .29 in-line '],
 [u'Q3: 08-13-15', u'EPS of .33 in-line ', u'Revenue of .49B', u'EPS of .33 in-line '],
 [u'Q2: 05-14-15', u'EPS of .29 beat by .01', u'Revenue of .44B', u'EPS of .29 beat by .01'],
 [u'Q1: 02-11-15', u'EPS of .27 in-line ', u'Revenue of .36B', u'EPS of .27 in-line '],
 [u'Q4: 11-13-14', u'EPS of .27 in-line ', u'Revenue of .26B', u'EPS of .27 in-line '],
 [u'Q3: 08-14-14', u'EPS of .28 beat by .01', u'Revenue of .27B', u'EPS of .28 beat by .01'],
 [u'Q2: 05-15-14', u'EPS of .28 in-line ', u'Revenue of .35B', u'EPS of .28 in-line '],
 [u'Q1: 02-11-14', u'EPS of .23 beat by .01', u'Revenue of .19B', u'EPS of .23 beat by .01']]
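
The revenue pattern has to match a literal dollar sign, and `$` is a metacharacter in Python's `re` syntax. The offline sketch below may help show why the escape matters and how `zip` assembles the per-quarter rows. The sample text, its amounts, and the names `sample`, `escaped`, `unescaped`, and `rows` are made up for illustration and are not taken from the live page.

```python
import re

# Hypothetical page text; the amounts are invented for illustration only.
sample = (
    "Q1: EPS of $1.00 beat by $0.10 | Revenue of $2.50B\n"
    "Q2: EPS of $1.20 in-line | Revenue of $3.00B"
)

# Escaped: "\$" matches a literal dollar sign.
escaped = re.findall(r'Revenue of \$[a-zA-Z0-9.]+', sample)

# Unescaped: "$" is an end-of-string anchor, and the pattern then requires
# at least one more character after it, so it can never match.
unescaped = re.findall(r'Revenue of $[a-zA-Z0-9.]+', sample)

# zip pairs the parallel lists element by element, one row per quarter.
titles = ['Q1', 'Q2']
rows = [list(row) for row in zip(titles, escaped)]

print(escaped)
print(unescaped)
print(rows)
```

Note that `zip` stops at the shortest input, so if one of the lists comes back shorter than the others (for example because the page markup changed), rows are dropped silently rather than raising an error.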