    import pandas as pd

    daily_info = pd.read_html('https://www.investing.com/earnings-calendar/', flavor='html5lib')
    print(daily_info)
Unfortunately, this raises:

    urllib.error.HTTPError: HTTP Error 403: Forbidden

Is there any way to fix this?
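Solution

Pretend to be a browser:

    import pandas as pd
    import requests

    url = 'https://www.investing.com/earnings-calendar/'
    # Browser-like headers, so the server does not see a default script client.
    header = {
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
        "X-Requested-With": "XMLHttpRequest"
    }

    r = requests.get(url, headers=header)
    # Parse every <table> in the downloaded page into a list of DataFrames.
    dfs = pd.read_html(r.text)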
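As background on why the header matters: the traceback comes from urllib, which suggests that `pd.read_html` downloads a URL through it, and urllib identifies itself with a default User-Agent of the form `Python-urllib/3.x`. The stdlib-only sketch below makes no network call; it only prints that default and shows how a browser-like header can be attached to a request. The variable names are illustrative, and that the site's 403 is triggered by the User-Agent alone is an assumption.

```python
import urllib.request

# Headers urllib attaches by default to every request it opens.
default_headers = dict(urllib.request.build_opener().addheaders)
print(default_headers["User-agent"])  # of the form "Python-urllib/3.x"

url = "https://www.investing.com/earnings-calendar/"
browser_headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
}

# Building a Request does not contact the server; urlopen(request) would.
request = urllib.request.Request(url, headers=browser_headers)
print(request.get_header("User-agent"))
```

Note that `Request` stores header names capitalized as `User-agent`, which is why the lookup uses that spelling.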
Result of the requests-based code:
    In [201]: len(dfs)
    Out[201]: 7

    In [202]: dfs[0]
    Out[202]:
         0    1    2    3
    0  NaN  NaN  NaN  NaN

    In [203]: dfs[1]
    Out[203]:
                    Unnamed: 0                                     Company   EPS  / Forecast  Revenue  / Forecast.1  Market Cap  Time
     0  Monday, April 24, 2017                                         NaN   NaN         NaN      NaN           NaN         NaN   NaN
     1                     NaN                                Acadia (AKR)    --      / 0.11       --          / --       2.63B   NaN
     2                     NaN                                 Agree (ADC)    --      / 0.39       --          / --       1.34B   NaN
     3                     NaN                                  Alcoa (AA)    --      / 0.53       --          / --       5.84B   NaN
     4                     NaN                       American Campus (ACC)    --      / 0.27       --          / --       6.62B   NaN
     5                     NaN                  Ameriprise Financial (AMP)    --      / 2.52       --          / --      19.76B   NaN
     6                     NaN                         Avacta Group (AVTG)    --        / --    1.26M          / --      47.53M   NaN
     7                     NaN                        Bank of Hawaii (BOH)   1.2      / 1.08   165.8M          / --       3.48B   NaN
     8                     NaN                        Bank of Marin (BMRC)  0.74       / 0.8       --          / --     422.29M   NaN
     9                     NaN                               Banner (BANR)    --      / 0.68       --          / --       1.82B   NaN
    10                     NaN                          Barrick Gold (ABX)    --       / 0.2       --          / --      22.44B   NaN
    11                     NaN                          Barrick Gold (ABX)    --      / 0.28       --          / --      30.28B   NaN
    12                     NaN              Berkshire Hills Bancorp (BHLB)    --      / 0.54       --          / --       1.25B   NaN
    13                     NaN  Brookfield Canada Office Properties (BOXC)    --        / --       --          / --         NaN   NaN
    ...
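The same approach can be wrapped in a small reusable helper. This is a minimal sketch, not the original answer's code: `fetch_html`, `read_tables`, the `session` parameter, and the 10-second timeout are hypothetical names and choices introduced here. It adds two things to the snippet above: `raise_for_status()`, so a remaining 403 surfaces as an exception instead of being parsed as an error page, and `io.StringIO`, because newer pandas versions deprecate passing a literal HTML string to `read_html`.

```python
import io

import pandas as pd
import requests

# Same browser-like headers as in the answer above.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest",
}


def fetch_html(url, session=None, timeout=10):
    """Return the page body; raise requests.HTTPError on a 4xx/5xx status."""
    # `session` is a hypothetical hook: any object with a requests-style .get().
    client = requests if session is None else session
    response = client.get(url, headers=BROWSER_HEADERS, timeout=timeout)
    response.raise_for_status()
    return response.text


def read_tables(url):
    """Fetch `url` with browser-like headers and parse every <table> in it."""
    # Wrap the text in StringIO so read_html receives a file-like object.
    return pd.read_html(io.StringIO(fetch_html(url)))
```

A call would look like `dfs = read_tables('https://www.investing.com/earnings-calendar/')`. A site may still refuse the request for reasons other than the headers, in which case `fetch_html` raises rather than returning an error page.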