The requests package: 80% of use cases in 10 minutes
2020-10-14
RandyLou
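1. GET request with parameters
import requests
# A list value is expanded into a repeated key in the query string (c=3&c=4)
params = {"a": 1, "b": 2, "c": [3, 4]}
r = requests.get("http://httpbin.org/get", params=params)
print(r.url)   # final URL, including the encoded query string
print(r.text)  # response body (httpbin echoes the request back)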
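To check how requests encodes `params` without touching the network, one option is to build the request and inspect it before sending. A minimal sketch using `requests.Request(...).prepare()`; nothing is actually sent here, and the variable name `query_url` is only illustrative:

```python
import requests

params = {"a": 1, "b": 2, "c": [3, 4]}

# Build and prepare the request, but do not send it.
prepared = requests.Request("GET", "http://httpbin.org/get", params=params).prepare()

# The list value shows up as a repeated key in the query string.
query_url = prepared.url
print(query_url)
```

This is handy when debugging, because it shows the exact URL that would go on the wire.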
2. POST requests
2.1 Form submission
import requests
# A dict passed via data= is sent as an application/x-www-form-urlencoded body
params = {"a": 1, "b": 2, "c": [3, 4]}
r = requests.post("http://httpbin.org/post", data=params)
print(r.url)
print(r.text)
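The form encoding can be inspected the same way, without sending anything. A small sketch, again using a prepared request; `form_content_type` and `form_body` are just illustrative names:

```python
import requests

params = {"a": 1, "b": 2, "c": [3, 4]}

# Prepare (but do not send) the POST to see what requests would transmit.
prepared = requests.Request("POST", "http://httpbin.org/post", data=params).prepare()

form_content_type = prepared.headers["Content-Type"]
form_body = prepared.body

print(form_content_type)
print(form_body)
```

Note that the parameters now travel in the request body rather than in the URL.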
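2.2 JSON request body
import json
import requests
params = {"a": 1, "b": 2, "c": [3, 4]}
# A string passed via data= is sent as-is with no Content-Type, so declare it explicitly
headers = {"Content-Type": "application/json"}
r = requests.post("http://httpbin.org/post", data=json.dumps(params), headers=headers)
print(r.url)
print(r.text)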
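Or let requests do the JSON serialization itself:
import requests
params = {"a": 1, "b": 2, "c": [3, 4]}
# Pass the dict itself: json= serializes it and sets Content-Type: application/json.
# Passing json.dumps(params) here would encode the data twice and send a JSON string.
r = requests.post("http://httpbin.org/post", json=params)
print(r.url)
print(r.text)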
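The difference between `data=` and `json=` is easiest to see by comparing prepared requests side by side. The sketch below sends nothing; `prepare_post` is a hypothetical helper introduced only for this comparison. It also shows that handing an already-serialized string to `json=` encodes it a second time:

```python
import json

import requests

params = {"a": 1, "b": 2, "c": [3, 4]}
url = "http://httpbin.org/post"


def prepare_post(**kwargs):
    """Hypothetical helper: prepare a POST without sending it."""
    return requests.Request("POST", url, **kwargs).prepare()


as_string = prepare_post(data=json.dumps(params))       # hand-serialized body
as_json = prepare_post(json=params)                     # requests serializes the dict
double_encoded = prepare_post(json=json.dumps(params))  # string gets serialized again

# A string body gets no Content-Type unless you set one yourself.
print(as_string.headers.get("Content-Type"))
# json= sets the header for you.
print(as_json.headers.get("Content-Type"))

decoded_once = json.loads(as_json.body)           # back to the original dict
decoded_double = json.loads(double_encoded.body)  # still a str: one more loads needed
print(type(decoded_once).__name__, type(decoded_double).__name__)
```

In short: pass a dict to `json=`, or a serialized string to `data=` together with an explicit Content-Type header, but do not mix the two.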
3. File upload
Upload a file and submit form fields in the same request:
import requests
params = {"a": 1, "b": 2, "c": [3, 4]}
# Open the file in a with-block so the handle is closed once the upload finishes
with open('./simple_request.py', 'rb') as f:
    # Each entry maps a field name to (filename, file object, content type, extra part headers)
    files = {'simple_request.py': ('simple_request.py', f, 'text/plain', {'Expires': '0'})}
    r = requests.post("http://httpbin.org/post", data=params, files=files)
print(r.url)
print(r.text)
In most cases, when a file is uploaded from code, what we have in hand is not a file on disk but an IO object or a byte string. Just wrap the bytes in a BytesIO and pass that instead:
import requests
from io import BytesIO
params = {"a": 1, "b": 2, "c": [3, 4]}
b = b"abcefg"
# BytesIO gives the in-memory bytes a file-like interface that requests can read from
files = {'simple_request.py': ('simple_request.py', BytesIO(b), 'text/plain', {'Expires': '0'})}
r = requests.post("http://httpbin.org/post", data=params, files=files)
print(r.url)
print(r.text)
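To confirm what a multipart upload looks like on the wire, the request can be prepared and inspected locally. This is a rough sketch, not sent anywhere; the field name `upload` and the filename `report.txt` are made up for illustration:

```python
from io import BytesIO

import requests

payload = b"abcefg"

# field name -> (filename, file-like object, content type, extra part headers)
files = {"upload": ("report.txt", BytesIO(payload), "text/plain", {"Expires": "0"})}

prepared = requests.Request(
    "POST",
    "http://httpbin.org/post",
    data={"a": 1, "b": 2},  # ordinary form fields travel in the same multipart body
    files=files,
).prepare()

upload_content_type = prepared.headers["Content-Type"]
upload_body = prepared.body  # bytes: form fields plus the file part

print(upload_content_type)
print(len(upload_body), "bytes")
```

The Content-Type carries a generated boundary string, which is why it should be left for requests to set rather than written by hand.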
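4. Request timeouts
There are two timeout values here: the connect timeout and the read timeout. The connect timeout is how long the client waits to establish a connection to the server. The read timeout is how long the client waits for the server to send data after the request has gone out (in practice, usually the wait for the first byte of the response).
The timeout passed to requests can be a single number of seconds, which is then used as both the connect timeout and the read timeout:
import requests
params = {"a": 1, "b": 2, "c": [3, 4]}
# 1 second for connecting and 1 second for reading
r = requests.post("http://httpbin.org/post", data=params, timeout=1)
print(r.url)
print(r.text)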
Or pass a tuple to set the two values separately:
import requests
params = {"a": 1, "b": 2, "c": [3, 4]}
# timeout=(connect timeout, read timeout), both in seconds
r = requests.post("http://httpbin.org/post", data=params, timeout=(3, 4))
print(r.url)
print(r.text)
If the service is inherently slow and you are willing to wait indefinitely, set timeout to None:
import requests
params = {"a": 1, "b": 2, "c": [3, 4]}
# timeout=None (also the default when timeout is omitted) means the call can block forever
r = requests.post("http://httpbin.org/post", data=params, timeout=None)
print(r.url)
print(r.text)
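When a timeout does fire, requests raises an exception rather than returning a response, so the call is normally wrapped in try/except. The sketch below assumes a throwaway local server (`SlowHandler`, a hypothetical stand-in for a slow service) so that it runs without external network access; the 1-second delay and 0.2-second read timeout are arbitrary values chosen for the demonstration:

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical slow service: waits 1 second before answering."""

    def do_GET(self):
        time.sleep(1.0)
        try:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"late")
        except OSError:
            pass  # the client has already given up and closed the socket

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), SlowHandler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

session = requests.Session()
session.trust_env = False  # ignore proxy settings from the environment for this local call

try:
    # Connecting is instant, but the 0.2 s read timeout expires before the reply arrives.
    session.get(url, timeout=(3, 0.2))
    outcome = "ok"
except requests.exceptions.ConnectTimeout:
    outcome = "connect timeout"
except requests.exceptions.ReadTimeout:
    outcome = "read timeout"
finally:
    server.shutdown()
    server.server_close()

print(outcome)
```

Both exceptions derive from `requests.exceptions.Timeout`, so catching that single class is enough when the two cases do not need different handling.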
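5. Proxies
Proxies are another common need. When writing a crawler, the target service often has anti-scraping measures, the simplest being detection of abnormal request volume from a single IP. Spreading requests across several proxy servers lowers the volume per IP and helps avoid being blocked:
import requests
url = 'https://httpbin.org/post'
def response_hook(r, *args, **kwargs):
    # Called when the response arrives; here it tags the response with an extra header
    print("response", r, args, kwargs)
    r.headers["WTF"] = "what is the fuck"
    return r
proxies = {
    'http': 'http://127.0.0.1:8888',  # 127.0.0.1:8888 is the Fiddler proxy on my machine
    'https': 'http://127.0.0.1:8888',
    'ftp': 'http://127.0.0.1:8888',
}
# verify=False disables TLS certificate verification (presumably because Fiddler re-signs HTTPS traffic); keep it to local debugging
r = requests.post(url, data={"a": 1, "b": 2}, hooks={"response": response_hook}, proxies=proxies, verify=False)
print(r.headers["WTF"])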
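Spreading traffic over several proxies could be sketched as a simple round-robin over a pool. Everything named here is an assumption for illustration: the addresses in `PROXY_POOL` are placeholders, and `next_proxies` and `fetch` are hypothetical helpers, not part of requests:

```python
import itertools

import requests

# Placeholder proxy addresses; replace with proxies you actually control.
PROXY_POOL = [
    "http://127.0.0.1:8888",
    "http://127.0.0.1:8889",
]

_rotation = itertools.cycle(PROXY_POOL)


def next_proxies():
    """Return a proxies dict for the next proxy in round-robin order."""
    proxy = next(_rotation)
    return {"http": proxy, "https": proxy}


def fetch(url, **kwargs):
    """Send a GET through the next proxy in the pool (hypothetical helper)."""
    return requests.get(url, proxies=next_proxies(), timeout=(3, 10), **kwargs)
```

Each call to `fetch` then leaves through a different proxy, for example `fetch("https://httpbin.org/get")`. A real crawler would likely also drop proxies that fail repeatedly, which this sketch does not attempt.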