HTTPError: HTTP Error 403: Forbidden


Within the current code, send the request with a browser-like User-Agent header:

Python 2.X

    import urllib2
    from BeautifulSoup import BeautifulSoup

    site = "http://en.wikipedia.org/wiki/StackOverflow"
    # Send a browser-like User-Agent instead of urllib2's default.
    hdr = {'User-Agent': 'Mozilla/5.0'}
    req = urllib2.Request(site, headers=hdr)
    page = urllib2.urlopen(req)
    soup = BeautifulSoup(page)
    print soup
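Python 3.X

    from urllib.request import Request, urlopen
    from bs4 import BeautifulSoup

    site = "http://en.wikipedia.org/wiki/StackOverflow"
    # Send a browser-like User-Agent instead of urllib's default.
    hdr = {'User-Agent': 'Mozilla/5.0'}
    req = Request(site, headers=hdr)
    page = urlopen(req)
    # Naming the parser explicitly avoids bs4's "no parser specified" warning.
    soup = BeautifulSoup(page, "html.parser")
    print(soup)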
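Python 3.X with Selenium (executes JavaScript functions)

    from selenium import webdriver

    # PhantomJS support was removed in Selenium 4, so this snippet
    # assumes an older Selenium release with PhantomJS installed.
    browser = webdriver.PhantomJS()
    browser.get("http://en.wikipedia.org/wiki/StackOverflow")  # get() returns None
    assert "Stack Overflow - Wikipedia" in browser.title
    browser.quit()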

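The modified version works because Wikipedia checks whether the User-Agent belongs to a "popular browser". The default User-Agent that urllib sends (of the form Python-urllib/x.y) does not pass that check, so the server responds with 403 Forbidden.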
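To make the User-Agent explanation concrete, here is a minimal Python 3 sketch that sets the header explicitly and reports a 403 instead of crashing. The names build_request, fetch and BROWSER_UA are hypothetical helpers introduced for illustration only; they are not part of urllib or of the original answer.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Hypothetical default for this sketch; the original answer uses the same string.
BROWSER_UA = "Mozilla/5.0"


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request that carries an explicit User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=BROWSER_UA, timeout=10):
    """Return the response body as bytes, or None if the server answers 403."""
    req = build_request(url, user_agent)
    try:
        with urlopen(req, timeout=timeout) as resp:
            return resp.read()
    except HTTPError as err:
        if err.code == 403:
            # The server refused the request, possibly because of the User-Agent.
            print("403 Forbidden for", url, "with User-Agent", user_agent)
            return None
        # Any other HTTP error is passed on to the caller unchanged.
        raise
```

Calling fetch("http://en.wikipedia.org/wiki/StackOverflow") would then return the page bytes when the request is accepted, or None when the server still answers 403. Whether a given User-Agent string is accepted is decided by the server, so the value above is an assumption, not a guarantee.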



