Python 3.7: Scraping Images from a Web Page

Published: 2019-07-19 09:56:49 | Editor: auto | Views: 2643

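#!/usr/bin/python

import re
import urllib.request  # In Python 3, urlopen and urlretrieve both live in urllib.request, so it must be imported


def htmlGet(url):
    # Fetch the page and return its body as raw bytes
    page = urllib.request.urlopen(url)
    html = page.read()
    return html


def imgGet(html):
    # Capture https URLs ending in .jpg inside src="..."; the dot is escaped so it
    # matches a literal ".", and [^"] keeps a match from running past the closing quote
    res = r'src="(https[^"]*?\.jpg)"'
    imgre = re.compile(res)
    # read() returns bytes in Python 3; without decode() a str pattern raises TypeError,
    # so the encoding has to be specified here
    imglist = re.findall(imgre, html.decode("utf-8"))
    x = 0
    for i in imglist:
        # Save each image to the current directory as 0.jpg, 1.jpg, ...
        urllib.request.urlretrieve(i, "%s.jpg" % x)
        x += 1


html = htmlGet("http://***")
imgGet(html)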
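
The regular expression above only catches absolute https links written exactly as src="...jpg". As a rough sketch of one alternative (not the approach used in this post), the standard library's html.parser can collect the src attribute of every img tag, and urllib.parse.urljoin can turn relative paths into full URLs. The names ImgSrcParser and extract_jpg_urls, and the example.com addresses, are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcParser(HTMLParser):
    """Hypothetical helper: collects the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def extract_jpg_urls(html_text, base_url):
    """Return absolute URLs of .jpg images found in html_text (a str)."""
    parser = ImgSrcParser()
    parser.feed(html_text)
    urls = []
    for src in parser.sources:
        absolute = urljoin(base_url, src)  # resolves relative paths against the page URL
        # Ignore any query string when checking the extension
        if absolute.lower().split("?")[0].endswith(".jpg"):
            urls.append(absolute)
    return urls


# Placeholder HTML and base URL, used only to show the call
sample = '<img src="/a/cat.jpg"><img src="logo.png">'
print(extract_jpg_urls(sample, "https://example.com/page/"))
```

The returned list could then be passed to urllib.request.urlretrieve in the same loop as imgGet. Because it takes a str, the page bytes still need to be decoded first.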
