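Note: at the time this question was asked, the correct way to fetch only the headers and stream the body was to use prefetch=False. That option has since been renamed to stream and the boolean value inverted, so you now want stream=True.
The original answer follows.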
Once you use iter_content(), you have to keep using it; .text indirectly uses the same interface under the hood (via .content).
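The underlying constraint is that bytes read from a stream are gone once consumed, so whatever you peeked at has to be kept and joined back onto the rest. Here is a minimal sketch of that idea using only the standard library. It uses io.BytesIO as a stand-in for the network response, and the helper name peek_then_read is hypothetical, not part of requests.

```python
import io


def peek_then_read(stream, peek_size=256, chunk_size=10 * 1024):
    """Read a small peek, then drain the rest of the stream.

    `stream` is any binary file-like object; here it stands in for a
    streamed HTTP body (an assumption made for illustration only).
    """
    peek = stream.read(peek_size)
    # Keep reading fixed-size chunks until read() returns b"" (end of stream).
    rest = b"".join(iter(lambda: stream.read(chunk_size), b""))
    # The peeked bytes were already consumed, so prepend them by hand.
    return peek, peek + rest


body = io.BytesIO(b"<html>" + b"x" * 1000 + b"</html>")
peek, full = peek_then_read(body)
print(len(peek), len(full))
```

Reading from the stream again after this returns nothing, which is why the peeked chunk has to be saved.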
In other words, by using iter_content() at all, you have to do the work .text does by hand:
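    import magic
    import requests
    from requests.compat import chardet

    # stream=True defers the body download (this was prefetch=False in old releases)
    r = requests.get("http://www.december.com/html/demo/hello.html", stream=True)
    peek = next(r.iter_content(256))  # read only the first chunk
    mime = magic.from_buffer(peek, mime=True)

    if mime == "text/html":
        # the peeked bytes are already consumed, so prepend them to the rest
        contents = peek + b''.join(r.iter_content(10 * 1024))
        encoding = r.encoding
        if encoding is None:
            # detect encoding
            encoding = chardet.detect(contents)['encoding']
        try:
            textcontent = str(contents, encoding, errors='replace')
        except (LookupError, TypeError):
            # unknown or missing encoding: fall back to the default codec
            textcontent = str(contents, errors='replace')
        print(textcontent)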
This assumes you are using Python 3.
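The decoding step on its own can be sketched as a small Python 3 helper, which makes the fallback behaviour easier to see and test. The name decode_body and the injectable detect parameter are hypothetical, introduced here for illustration; in practice you might pass chardet.detect as the detector.

```python
def decode_body(contents, encoding=None, detect=None):
    """Decode raw body bytes to text, replacing undecodable bytes.

    `detect` is an optional callable returning a dict with an
    'encoding' key (the shape chardet.detect returns).
    """
    if encoding is None and detect is not None:
        encoding = detect(contents).get("encoding")
    try:
        return str(contents, encoding, errors="replace")
    except (LookupError, TypeError):
        # LookupError: unknown codec name; TypeError: encoding is None.
        # Fall back to the default codec (UTF-8).
        return str(contents, errors="replace")


print(decode_body(b"caf\xc3\xa9", "utf-8"))
```

Passing errors="replace" means a wrong guess degrades to replacement characters instead of raising an exception.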
The alternative is to make two requests:
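    import magic
    import requests

    url = "http://www.december.com/html/demo/hello.html"
    # first request: stream the body and sniff only the first chunk
    r = requests.get(url, stream=True)
    mime = magic.from_buffer(next(r.iter_content(256)), mime=True)
    r.close()  # discard the rest of the streamed body

    if mime == "text/html":
        # second request: let requests download and decode the body itself
        print(requests.get(url).text)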
Python 2 version:
    import magic
    import requests
    from requests.compat import chardet

    r = requests.get("http://www.december.com/html/demo/hello.html", stream=True)
    peek = r.iter_content(256).next()
    mime = magic.from_buffer(peek, mime=True)

    if mime == "text/html":
        # the peeked bytes are already consumed, so prepend them to the rest
        contents = peek + ''.join(r.iter_content(10 * 1024))
        encoding = r.encoding
        if encoding is None:
            # detect encoding
            encoding = chardet.detect(contents)['encoding']
        try:
            textcontent = unicode(contents, encoding, errors='replace')
        except (LookupError, TypeError):
            textcontent = unicode(contents, errors='replace')
        print(textcontent)