爬虫的小坑总结

2019-07-21  本文已影响0人  Leernh


想要爬取京东商品的数据,但是get请求数据的时候发现必须要登录才行,百度发现要在请求中加入headers模拟用浏览器发送请求,于是继续查了一下什么是UserAgemt,在requests.get(url, hearders=UA)加上UA后就能正常访问了。

什么是User-Agent

User-Agent是Http协议中的一部分,属于头域的组成部分,User Agent也简称UA。用较为普通的一点来说,是一种向访问网站提供你所使用的浏览器类型、操作系统及版本、CPU 类型、浏览器渲染引擎、浏览器语言、浏览器插件等信息的标识。UA字符串在每次浏览器 HTTP 请求时发送到服务器!

一些常用的User-Agent

1) Chrome

Win7:

Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.163 Safari/535.1

2) Firefox

Win7:

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20100101 Firefox/6.0

3) Safari

Win7:

Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50

4) Opera

Win7:

Opera/9.80 (Windows NT 6.1; U; zh-cn) Presto/2.9.168 Version/11.50

5) IE

Win7+ie9:

Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0; .NET CLR 2.0.50727; SLCC2; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C; Tablet PC 2.0; .NET4.0E)

Win7+ie8:

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; InfoPath.3)

WinXP+ie8:

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.0)

WinXP+ie7:

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)

WinXP+ie6:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

6) 傲游

傲游3.1.7在Win7+ie9,高速模式:

Mozilla/5.0 (Windows; U; Windows NT 6.1; ) AppleWebKit/534.12 (KHTML, like Gecko) Maxthon/3.0 Safari/534.12

傲游3.1.7在Win7+ie9,IE内核兼容模式:

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C; .NET4.0E)

7) 搜狗

搜狗3.0在Win7+ie9,IE内核兼容模式:

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C; .NET4.0E; SE 2.X MetaSr 1.0)

搜狗3.0在Win7+ie9,高速模式:

Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.33 Safari/534.3 SE 2.X MetaSr 1.0

8) 360

360浏览器3.0在Win7+ie9:

Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C; .NET4.0E)

9) QQ浏览器

QQ浏览器6.9(11079)在Win7+ie9,极速模式:

Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.41 Safari/535.1 QQBrowser/6.9.11079.201

QQ浏览器6.9(11079)在Win7+ie9,IE内核兼容模式:

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C; .NET4.0E) QQBrowser/6.9.11079.201

10) 阿云浏览器

阿云浏览器1.3.0.1724 Beta(编译日期2011-12-05)在Win7+ie9:

Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)

上一篇下一篇

猜你喜欢

热点阅读