Learning Python Web Crawling

2020-07-09 田福成

[Python crawler: attributes of the Response object](https://www.cnblogs.com/hjw1/p/8271422.html)
```python
# coding: UTF-8
import requests

url = "http://www.baidu.com/s?wd="
wd = "链条"  # search keyword
url = url + wd
r = requests.get(url)
print(r.status_code)  # check the status code; 200 means OK (success)
# 200
print(r.text)  # the page content as a string
```
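The example above appends the keyword to the URL by hand. As a minimal sketch of an alternative, the keyword can be passed through the `params` argument so that requests does the URL encoding. The helper name `build_search_url` is hypothetical and only used for illustration; the request is prepared but never sent.

```python
import requests


def build_search_url(keyword: str) -> str:
    """Return the Baidu search URL for `keyword` without sending anything."""
    req = requests.Request("GET", "http://www.baidu.com/s", params={"wd": keyword})
    # prepare() applies the same URL encoding that requests.get() would apply
    return req.prepare().url


print(build_search_url("链条"))
# http://www.baidu.com/s?wd=%E9%93%BE%E6%9D%A1
```

To actually send the request, the same dictionary can be handed to `requests.get("http://www.baidu.com/s", params={"wd": wd})`.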

Python crawler: the Response object

r=requests.get("http://www.baidu.com/")

Attributes

r.status_code

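The HTTP status code of the response. 200 means the request succeeded; 404 means the server could not find the requested resource (the connection itself worked).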

r.text

The HTTP response body as a string, i.e. the content of the page at the URL

r.encoding

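The encoding of the response content as guessed from the HTTP headers; when a text response declares no charset, requests falls back to ISO-8859-1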

r.apparent_encoding

The encoding of the response content as inferred from the content itself (a fallback encoding)

r.content

The HTTP response body as bytes

r.headers

The headers of the HTTP response
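The attributes above can be read together from any Response object. The following is a rough sketch, not a definitive recipe: the function names `summarize` and `decoded_text` are hypothetical, and treating ISO-8859-1 as "no charset declared" is an assumption that will be wrong for servers that really send ISO-8859-1.

```python
import requests


def summarize(r: requests.Response) -> dict:
    """Collect the Response attributes described above into a plain dict."""
    return {
        "status_code": r.status_code,              # e.g. 200 on success
        "encoding": r.encoding,                    # guessed from the headers
        "apparent_encoding": r.apparent_encoding,  # guessed from the body
        "content_type": r.headers.get("Content-Type"),
        "size_bytes": len(r.content),              # r.content is bytes
    }


def decoded_text(r: requests.Response) -> str:
    """Return r.text, switching to apparent_encoding if no charset was given."""
    # Assumption: ISO-8859-1 here means the headers declared no charset,
    # which is the fallback requests applies to text responses.
    if r.encoding is None or r.encoding.lower() == "iso-8859-1":
        r.encoding = r.apparent_encoding
    return r.text
```

For example, `summarize(requests.get("http://www.baidu.com/"))` would report the status code, both encodings, the content type and the body size for the page requested earlier. Setting `r.encoding` before reading `r.text` is what changes how the bytes in `r.content` are decoded.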

The 7 main methods of the requests library

https://www.cnblogs.com/liutongqing/p/6978155.html#http协议

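requests.request() Constructs a request; the base method underlying all the methods below
requests.get() The main method for fetching an HTML page; corresponds to HTTP GET
requests.head() Fetches only the header information of an HTML page; corresponds to HTTP HEAD
requests.post() Submits a POST request to an HTML page; corresponds to HTTP POST
requests.put() Submits a PUT request to an HTML page; corresponds to HTTP PUT
requests.patch() Submits a partial-modification request to an HTML page; corresponds to HTTP PATCH
requests.delete() Submits a delete request to an HTML page; corresponds to HTTP DELETE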
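The relationship between the base method and the six convenience methods can be sketched as follows. The names `send`, `preview` and `ITEM_URL` are hypothetical, and the example.com URL is a placeholder rather than a real API; `preview` only builds each request so that nothing is sent over the network.

```python
import requests


def send(method: str, url: str, **kwargs) -> requests.Response:
    """Send any HTTP method through the base function requests.request()."""
    kwargs.setdefault("timeout", 10)  # avoid waiting forever on a dead server
    return requests.request(method, url, **kwargs)


def preview(method: str, url: str, **kwargs) -> requests.PreparedRequest:
    """Build the request that would be sent, without touching the network."""
    return requests.Request(method.upper(), url, **kwargs).prepare()


# Hypothetical URL, used only to show what each call would put on the wire.
ITEM_URL = "http://example.com/items/1"

for name in ("get", "head", "post", "put", "patch", "delete"):
    print(name, "->", preview(name, ITEM_URL).method)

# Form data passed via `data` ends up URL-encoded in the request body.
form = preview("post", ITEM_URL, data={"key": "value"})
print(form.method, form.body)
```

With these helpers, `send("get", url)` behaves like `requests.get(url, timeout=10)`, since `requests.get()` itself forwards to `requests.request()`.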