Building a Web Image Gallery with Python

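When browsing the web you often find a page full of images you would like to collect, but handling them one at a time is too slow. This script gathers all of them into a single gallery page in one run.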

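# Import the libraries used for scraping
import re
import requests

# Fetch the first listing page and read its HTML as a string
response = requests.get('http://www.netbian.com/s/wangzherongyao/')
string = response.text
# Match image URLs: "http://", a run of non-whitespace characters, then "jpg"
pattern = re.compile(r'http://[^\s]*jpg')
result = re.findall(pattern, string)
# Images we do not want in the gallery
unwanted = {
    'http://img.netbian.com/file/2020/0907/e1b3c3085b8ed9cf769758e36029ed62.jpg',
    'http://img.netbian.com/file/2021/1026/95a452ab2a80121473ceb1fce3e88cfc.jpg',
}
# Build a new list instead of calling remove() while iterating, which can skip items
result = [p for p in result if p not in unwanted]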

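Here I want to collect the images from this URL. Their addresses can be found in the page source, and any unwanted ones can simply be filtered out of the list.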
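
As a minimal sketch of the extract-then-filter step, the snippet below applies the same kind of pattern to a small hard-coded HTML string instead of a live page. The function name `extract_jpg_urls`, its `unwanted` parameter and the example.com addresses are illustrative assumptions and do not come from the original script.

```python
import re

# Same idea as the script's pattern: "http://", non-whitespace characters, then "jpg"
JPG_PATTERN = re.compile(r'http://[^\s]*jpg')

def extract_jpg_urls(html, unwanted=()):
    """Return every matching image URL in html, minus the unwanted ones."""
    urls = JPG_PATTERN.findall(html)
    skip = set(unwanted)
    # A list comprehension builds a new list, so no element is skipped
    return [u for u in urls if u not in skip]

# Hypothetical sample markup standing in for a downloaded page
sample_html = (
    '<img src="http://example.com/a.jpg" alt="a">\n'
    '<img src="http://example.com/b.jpg" alt="b">\n'
    '<img src="http://example.com/c.jpg" alt="c">\n'
)

kept = extract_jpg_urls(sample_html, unwanted=['http://example.com/b.jpg'])
print(kept)
```

Filtering into a new list is used here because calling `remove()` on a list while looping over it shifts the remaining items, so the element right after a removed one is never examined.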

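# Strings for the opening part of the HTML page.
# Note: the tag text inside these strings was lost when this post was extracted,
# so they appear empty here.
l = ''
e = ''
b = ''
z = ''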
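We first write the page header to the txt file so that the resulting page is valid (note: the markup must not contain stray spaces, otherwise the page will not display).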
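
The header strings themselves did not survive in this copy of the post, so the following is only a guess at what the header step could look like. The tags in `HEADER_LINES`, the helper name `write_header` and the scratch file path are all assumptions made for illustration.

```python
import os
import tempfile

# Assumed header tags; the values used in the original script are not known
HEADER_LINES = ['<html>', '<head><meta charset="utf-8"></head>', '<body>']

def write_header(path):
    """Append the opening HTML tags to the text file, one per line."""
    with open(path, 'a', encoding='utf-8') as f:
        for line in HEADER_LINES:
            f.write(line + '\n')

# Usage: write the header into a scratch file and read it back
demo_path = os.path.join(tempfile.mkdtemp(), 'pics.txt')
write_header(demo_path)
with open(demo_path, encoding='utf-8') as f:
    header_text = f.read()
print(header_text)
```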
# Pages 2 through 24 follow the pattern index_<n>.htm
for o in range(2, 25):
    response = requests.get(f"http://www.netbian.com/s/wangzherongyao/index_{o}.htm")
    string = response.text
    pattern = re.compile(r'http://[^\s]*jpg')
    result = re.findall(pattern, string)
    # Drop the two unwanted images; filtering avoids removing items while iterating
    result = [p for p in result if p not in (
        'http://img.netbian.com/file/2020/0907/e1b3c3085b8ed9cf769758e36029ed62.jpg',
        'http://img.netbian.com/file/2021/1026/95a452ab2a80121473ceb1fce3e88cfc.jpg',
    )]
    # Append one line per image to the text file
    with open('pics.txt', 'a') as a:
        for p in result:
            # Assumption: the original tag text was lost in extraction;
            # an <img> tag pointing at p is the likely intent
            a.write(f'<img src="{p}">' + '\n')
# Strings for the closing part of the page (their tag text was also lost in extraction)
v = ''
m = ''
n = ''
with open('pics.txt', 'a') as a:
    a.write('\n')
    a.write(v + '\n')
    a.write(m + '\n')
    a.write(n)

 
The same idea applies here: because the page URLs follow a regular pattern, I take a shortcut and simply loop over them.
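
To make the URL pattern explicit, here is a small sketch that only builds the list of listing-page addresses and does no network access. The helper name `page_urls` and its `last_page` parameter are hypothetical; the base address and the `index_<n>.htm` suffix are the ones used in the script.

```python
BASE = 'http://www.netbian.com/s/wangzherongyao/'

def page_urls(last_page):
    """Return the listing URLs for pages 1 through last_page."""
    urls = [BASE]  # the first page has no index suffix
    urls += [f'{BASE}index_{n}.htm' for n in range(2, last_page + 1)]
    return urls

urls = page_urls(24)
print(len(urls))
print(urls[1])
```

Building the list up front keeps the page numbering in one place, so the download loop can iterate over `urls` without a separate counter.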
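# Read everything collected in the text file
with open('pics.txt') as f:
    text = f.read()

# Write the same content to a new file with an .html extension
with open('pics.html', 'w') as w:
    w.write(text)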
 
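Finally, this step changes the file extension. The original file is left untouched; a new web page file is simply created alongside it.
Open that page and the images are all there.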
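
One alternative way to get the same result, shown here only as a sketch, is to copy the text file under a new suffix with the standard library. The helper name `copy_as_html` and the scratch directory are assumptions for illustration; the script in this post reads and rewrites the file instead.

```python
import shutil
import tempfile
from pathlib import Path

def copy_as_html(txt_path):
    """Copy txt_path to a sibling file with an .html suffix and return the new path."""
    src = Path(txt_path)
    dst = src.with_suffix('.html')
    shutil.copyfile(src, dst)  # the source file is left unchanged
    return dst

# Usage with a scratch file so nothing real is overwritten
workdir = Path(tempfile.mkdtemp())
txt = workdir / 'pics.txt'
txt.write_text('<html><body></body></html>', encoding='utf-8')
html_file = copy_as_html(txt)
print(html_file.name)
```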
					
										


					
Feel free to share. When reposting, please credit the source: 内存溢出
Original article: http://outofmemory.cn/zaji/5679977.html
