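General steps
1. Open the web page and press F12 to find the URL of the image you want to fetch.
2. Write a regular expression that matches that image URL.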
Example: fetch the images from one region of this page: https://blog.csdn.net/julielele?spm=3001.5343
Press F12 to inspect the image links.
The resulting regular expression:
format = r'src="http://www.kaotop.com/skin/sinaskin/image/nopic.gif" alt'
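To make the role of the pattern concrete, here is a minimal sketch of how `re.findall` pulls links out of page source. The HTML fragment, the pattern with a capture group, and the helper name `extract_links` are all assumptions made up for illustration; they are not taken from the page above.

```python
import re

# Hypothetical HTML fragment, invented for illustration only
sample_html = (
    '<img src="https://example.com/a/cat.png" alt="cat">'
    '<img src="https://example.com/b/dog.png" alt="dog">'
)

# The parentheses form a capture group. The non-greedy .*? stops at the
# first closing quote that is followed by " alt".
pattern = r'src="(.*?)" alt'


def extract_links(pattern, html):
    """Return what re.findall yields for pattern in html (hypothetical helper)."""
    return re.findall(pattern, html)


links = extract_links(pattern, sample_html)
print(links)
# ['https://example.com/a/cat.png', 'https://example.com/b/dog.png']
```

When the pattern contains one capture group, `re.findall` returns only the captured text. Without a group it returns the whole match, so a pattern meant to yield URLs usually needs parentheses around the URL part.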
Code example
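import os
import re
import time
import urllib.request


def getImage(format, url, filePath):
    '''
    :param format: regular expression that matches the image links
    :param url: address of the page to fetch images from
    :param filePath: folder in which the downloaded images are stored
    :return: None
    '''
    response = urllib.request.urlopen(url)
    buf = response.read().decode('utf-8')
    # Collect the image links that match the pattern.
    # re.findall returns the capture group's text when the pattern has one,
    # otherwise the whole match.
    listurl = re.findall(format, buf)
    print(listurl)
    # Build the full image links by appending the extension
    res = []
    for link in listurl:
        res.append(link + ".png")
    # Create a timestamped output folder so repeated runs do not collide
    timestr = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
    path = os.path.join(filePath, "img" + timestr)
    if not os.path.exists(path):
        os.makedirs(path)
    index = 0
    for link in res:
        print(link)
        try:
            data = urllib.request.urlopen(link).read()
            # The with block closes the file even if the write fails
            with open(os.path.join(path, str(index) + '.png'), 'wb') as f:
                f.write(data)
            index = index + 1
        except Exception:
            # Skip links that cannot be downloaded
            continue


url = "https://blog.csdn.net/julielele?spm=3001.5343"
# Alternative pattern: capture the text between url(' and .png
# format = r"url\('(.*)\.png"
format = r'src="http://www.kaotop.com/skin/sinaskin/image/nopic.gif" alt'
filePath = r"d:\img"
getImage(format, url, filePath)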
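The "build the image links" step can be made a little more robust. The sketch below assumes that the captured text may be a relative path, a protocol-relative link, or already a full URL, and resolves each case against the page address with the standard library's `urljoin`. The function name `build_image_links` and the example URLs are hypothetical.

```python
from urllib.parse import urljoin


def build_image_links(page_url, captured, extension=".png"):
    """Turn captured link fragments into absolute image URLs (hypothetical helper)."""
    links = []
    for part in captured:
        # Append the extension only when the fragment does not already end with it
        if not part.lower().endswith(extension):
            part = part + extension
        # urljoin leaves absolute URLs untouched and resolves relative ones
        links.append(urljoin(page_url, part))
    return links


# Made-up page address and fragments, for illustration only
page = "https://example.com/blog/post.html"
fragments = ["/img/a", "pics/b.png", "https://cdn.example.com/c.png"]
print(build_image_links(page, fragments))
```

Keeping this step in its own function lets it be checked without any network access, separately from the download loop.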
Result after running: