Browsing an NTLM-protected website using Python and python-ntlm

My task is to create a script that logs in to a company portal, goes to a specific page, downloads it, compares it with an earlier version, and then emails someone depending on what has changed. The last parts are easy enough; it is the first step that is giving me the most trouble.

After failing to connect with urllib2 (I am trying to do this in Python) and about 4 or 5 hours of googling, I have determined that the reason I can't connect is the NTLM authentication on the web page. I have tried a number of different connection approaches found on this site and others, to no avail. Based on the NTLM example, I did:

    import urllib2
    from ntlm import HTTPNtlmAuthHandler

    user = 'username'
    password = "password"
    url = "https://portal.whatever.com/"

    passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
    passman.add_password(None, url, user, password)
    # create the NTLM authentication handler
    auth_NTLM = HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman)

    # create and install the opener
    opener = urllib2.build_opener(auth_NTLM)
    urllib2.install_opener(opener)

    # create a header
    user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
    header = {'Connection': 'Keep-alive', 'User-Agent': user_agent}

    response = urllib2.urlopen(urllib2.Request(url, None, header))

When I run it (with a real username, password and URL), I get the following:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "ntlm2.py", line 21, in <module>
        response = urllib2.urlopen(urllib2.Request(url,header))
      File "C:\Python27\lib\urllib2.py", line 126, in urlopen
        return _opener.open(url, data, timeout)
      File "C:\Python27\lib\urllib2.py", line 400, in open
        response = meth(req, response)
      File "C:\Python27\lib\urllib2.py", line 513, in http_response
        'http', request, response, code, msg, hdrs)
      File "C:\Python27\lib\urllib2.py", line 432, in error
        result = self._call_chain(*args)
      File "C:\Python27\lib\urllib2.py", line 372, in _call_chain
        result = func(*args)
      File "C:\Python27\lib\urllib2.py", line 619, in http_error_302
        return self.parent.open(new, timeout=req.timeout)
      File "C:\Python27\lib\urllib2.py", line 438, in error
        return self._call_chain(*args)
      File "C:\Python27\lib\urllib2.py", in _call_chain
        result = func(*args)
      File "C:\Python27\lib\urllib2.py", line 521, in http_error_default
        raise HTTPError(req.get_full_url(), hdrs, fp)
    urllib2.HTTPError: HTTP Error 401: Unauthorized

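The most interesting thing to me is that the last line says a 401 error was sent back. From what I have read, a 401 is the first message sent back to the client when NTLM starts. I was under the impression that the purpose of python-ntlm was to handle the NTLM process for me. Is that wrong, or am I just using it incorrectly? Also, I am not restricted to Python, so if there is an easier way to do this in another language, let me know (from what I saw while googling, there isn't).
Thanks!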
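
For background on the handshake mentioned above: NTLM over HTTP normally involves two 401 responses. The server first answers 401 with a bare `WWW-Authenticate: NTLM`, the client retries with a type 1 (negotiate) token, the server answers 401 again with a type 2 (challenge) token, and the client finishes with a type 3 (authenticate) token. The following is a minimal Python 3 sketch, not part of python-ntlm, for telling those steps apart when reading captured headers. The function name `describe_ntlm_header` is hypothetical; the only protocol facts it relies on are the `NTLMSSP\0` signature and the 4-byte little-endian message type that follows it.

```python
import base64
import struct

NTLM_SIGNATURE = b"NTLMSSP\x00"
MESSAGE_NAMES = {1: "NEGOTIATE", 2: "CHALLENGE", 3: "AUTHENTICATE"}


def describe_ntlm_header(value):
    """Classify an Authorization or WWW-Authenticate header value.

    Hypothetical helper for illustration only.
    """
    parts = value.split(None, 1)
    if not parts or parts[0].upper() != "NTLM":
        return "not NTLM"
    if len(parts) == 1:
        # A bare "NTLM" is the server's first 401: it only offers the scheme.
        return "NTLM offered (no token yet)"
    try:
        token = base64.b64decode(parts[1])
    except ValueError:
        return "malformed NTLM token"
    if len(token) < 12 or not token.startswith(NTLM_SIGNATURE):
        return "malformed NTLM token"
    # The message type is a little-endian uint32 right after the signature.
    (message_type,) = struct.unpack_from("<I", token, 8)
    return MESSAGE_NAMES.get(message_type, "unknown type %d" % message_type)
```

If a library is handling NTLM for you, both 401 responses should be consumed internally; a 401 that surfaces as the final error suggests the handshake was not attempted or did not complete for that URL.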

Solution

If the site is using NTLM authentication, the headers attribute of the resulting HTTPError should say so:

    >>> try:
    ...   handle = urllib2.urlopen(req)
    ... except IOError, e:
    ...   print e.headers
    ...
    <other headers>
    WWW-Authenticate: Negotiate
    WWW-Authenticate: NTLM
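
The same check can be sketched for Python 3, where `urllib2` was split into `urllib.request` and `urllib.error`. This is only an illustration under a few assumptions: the helper names `offered_auth_schemes` and `probe` are hypothetical, and the sketch assumes each `WWW-Authenticate` header line carries a single scheme, as in the output shown above.

```python
import urllib.error
import urllib.request


def offered_auth_schemes(headers):
    """Return the scheme named on each WWW-Authenticate header line, in order.

    `headers` is any email.message.Message-like object, such as the
    `headers` attribute of urllib.error.HTTPError.
    """
    schemes = []
    for value in headers.get_all("WWW-Authenticate") or []:
        parts = value.split(None, 1)
        if parts:
            # The scheme is the first token; parameters may follow it.
            schemes.append(parts[0])
    return schemes


def probe(url):
    """Request `url` without credentials; return (status, offered schemes)."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.status, []
    except urllib.error.HTTPError as err:
        return err.code, offered_auth_schemes(err.headers)
```

Calling `probe` on the protected URL and seeing `NTLM` in the returned list would confirm that the server is asking for NTLM; if only other schemes appear, an NTLM handler will not help.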

Original URL: http://outofmemory.cn/langs/1196517.html
