Using urllib

Published: 2019-03-16 22:24:09 | Editor: auto

Basic usage of urlopen:

Environment: Python 3 (Windows)

Its full signature is:

urllib.request.urlopen(url, data=None, [timeout, ]*, cafile=None, capath=None, cadefault=False, context=None)

1. Send a request. Open the httpbin.org page; this is a GET request.

>>> import urllib.request
>>> response = urllib.request.urlopen("http://httpbin.org")

# The result is assigned to response
>>> print(response.read().decode('utf-8'))

# response.read() returns bytes, so we call decode to turn it into a str

httpbin.org: a site that is useful for HTTP testing later on
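The GET example above can be wrapped in a small helper. The sketch below is only an illustration: `fetch_text` is a hypothetical name, the 10-second default timeout is an assumption, and falling back to UTF-8 when the server declares no charset is a choice made for this sketch rather than something urllib does on its own.

```python
import urllib.request


def fetch_text(url, timeout=10.0):
    """Fetch a URL with GET and return the body decoded to str."""
    # The with-block closes the connection even if read() raises.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()  # bytes
        # Use the charset from the Content-Type header when present;
        # falling back to UTF-8 is an assumption of this sketch.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset)


# Usage (needs network access):
# print(fetch_text("http://httpbin.org/get"))
```

Reading the charset from the response headers avoids hard-coding 'utf-8' for pages that declare a different encoding.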

2. A POST request needs the data argument

>>> import urllib.parse
>>> import urllib.request
>>> data = bytes(urllib.parse.urlencode({"word":"hello"}),encoding="utf8")

# data must be bytes: urlencode turns the dict into a query string, which is then encoded
>>> response = urllib.request.urlopen("http://httpbin.org/post", data=data)
>>> print(response.read())
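To see what the data argument actually contains, the encoding step can be run on its own without sending anything. This is a minimal sketch: `encode_form` is a hypothetical helper and the extra `q` field is made up to show how spaces and `&` are escaped.

```python
from urllib import parse, request


def encode_form(fields):
    """Encode a dict as application/x-www-form-urlencoded bytes."""
    return parse.urlencode(fields).encode("utf-8")


body = encode_form({"word": "hello", "q": "a b&c"})
print(body)  # b'word=hello&q=a+b%26c'

# Building a Request does not send anything yet, so this runs offline.
req = request.Request("http://httpbin.org/post", data=body)
print(req.get_method())  # POST
```

Supplying data is what switches the request from GET to POST when no method is given explicitly.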

3. Setting a timeout

>>> import urllib.request
>>> response = urllib.request.urlopen("http://httpbin.org/get",timeout=1 )
>>> print(response.read())

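The response comes back normally.

If the timeout is lowered to 0.1 seconds, the request will probably fail with an exception, which we can catch: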

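>>> import socket
>>> import urllib.request
>>> import urllib.error

try:
    response = urllib.request.urlopen("http://httpbin.org/get", timeout=0.1)
except urllib.error.URLError as e:
    # A timeout while connecting is wrapped in URLError; check its reason
    if isinstance(e.reason, socket.timeout):
        print("TIME OUT")
except socket.timeout:
    # A timeout while reading the response can be raised directly
    print("TIME OUT")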

This typically prints TIME OUT.

The response returned after sending a request

1. Response type
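The timeout check can also be pulled into a reusable function. The following is a sketch under stated assumptions: `is_timeout` and `fetch_or_none` are hypothetical names, and returning None on a timeout is just one possible policy.

```python
import socket
import urllib.error
import urllib.request


def is_timeout(exc):
    """Return True if exc represents a timed-out request."""
    if isinstance(exc, socket.timeout):
        return True
    return isinstance(exc, urllib.error.URLError) and isinstance(
        exc.reason, socket.timeout
    )


def fetch_or_none(url, timeout):
    """Return the response body as bytes, or None if the request timed out."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except (urllib.error.URLError, socket.timeout) as exc:
        if is_timeout(exc):
            return None
        raise  # not a timeout: let the caller see the original error
```

A call such as `fetch_or_none("http://httpbin.org/get", 0.1)` would then return None instead of raising when the server is too slow, while other errors (for example an HTTP 404) still propagate.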

>>> import urllib.request
>>> response = urllib.request.urlopen("http://httpbin.org")
>>> print(type(response))
<class 'http.client.HTTPResponse'>

2. Status code and response headers

>>> import urllib.request
>>> response = urllib.request.urlopen("http://httpbin.org")
>>> print(response.status)  # the status code; 200 means success
200
>>> print(response.getheaders())  # all response headers, as a list of (name, value) tuples
[('Connection', 'close'), ('Server', 'gunicorn/19.9.0'), ('Date', 'Tue, 09 Oct 2018 12:49:34 GMT'), ('Content-Type', 'text/html; charset=utf-8'), ('Content-Length', '10122'), ('Access-Control-Allow-Origin', '*'), ('Access-Control-Allow-Credentials', 'true'), ('Via', '1.1 vegur')]

>>> print(response.getheader('Server'))
gunicorn/19.9.0

[This shows that the server is running gunicorn/19.9.0]
response.read(): returns the response body as bytes, which we can convert to str with decode

>>> import urllib.request
>>> response = urllib.request.urlopen("http://httpbin.org")
>>> print(response.read().decode('utf-8'))
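The status code and headers shown above can be gathered into one dictionary. This is a minimal sketch: `summarize` is a hypothetical helper, and the choice of which headers to keep is arbitrary.

```python
def summarize(response):
    """Collect status and selected headers from an HTTPResponse-like object."""
    # getheaders() returns a list of (name, value) tuples; header names are
    # case-insensitive, so normalize them to lower case for lookup.
    headers = {name.lower(): value for name, value in response.getheaders()}
    return {
        "status": response.status,
        "ok": 200 <= response.status < 300,
        "server": headers.get("server"),
        "content_type": headers.get("content-type"),
    }


# Usage (needs network access):
# import urllib.request
# with urllib.request.urlopen("http://httpbin.org") as response:
#     print(summarize(response))
```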

Basic usage of Request

(If we want to send headers or build a more complex request, we need Request.)

>>> import urllib.request
>>> request = urllib.request.Request("http://httpbin.org")

>>> response = urllib.request.urlopen(request)

>>> print(response.read().decode('utf-8'))
The output is exactly the same as when the URL is passed to urlopen directly; Request just makes it easier to customize the request

Here we imitate a browser when sending the request (the User-Agent string below identifies itself as MSIE 5.5)

from urllib import request, parse

url = "http://httpbin.org/post"
headers = {
    "User-Agent": "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)",
    "Host": "httpbin.org",
}
# Form fields to submit; named form so the built-in dict is not shadowed
form = {
    "name": "Germey",
}
data = bytes(parse.urlencode(form), encoding="utf8")
req = request.Request(url=url, data=data, headers=headers, method="POST")
response = request.urlopen(req)
print(response.read().decode("utf-8"))

This also returns a result
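A Request object can be inspected before it is sent, which is a convenient way to check the method, body, and headers. The sketch below assumes a hypothetical helper `build_form_request` and uses `add_header` as an alternative to passing a headers dict.

```python
from urllib import parse, request


def build_form_request(url, fields, user_agent):
    """Build (but do not send) a POST request carrying form data."""
    data = parse.urlencode(fields).encode("utf-8")
    req = request.Request(url, data=data, method="POST")
    # add_header is an alternative to the headers= argument
    req.add_header("User-Agent", user_agent)
    return req


req = build_form_request(
    "http://httpbin.org/post",
    {"name": "Germey"},
    "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)",
)
print(req.get_method())  # POST
print(req.full_url)      # http://httpbin.org/post
print(req.data)          # b'name=Germey'
# Request stores header names capitalized, so look it up as "User-agent"
print(req.get_header("User-agent"))
```

Passing the returned object to `request.urlopen(req)` would then send it.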

