Python Cohort 4 Web Scraping Homework

[Python Web Scraping] Exercise 13: Practicing the requests Module

2017-08-31  阳光总在风雨后_db57

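13. Exercise 13: Practicing the requests Module
1. Build a request to the 阳光电影网 site (url, headers)
2. Print the status code of the response
3. Print the page source returned by the request
4. Save the source as an HTML file (named 'moive.html')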

import requests

url = 'http://www.ygdy8.com/'
# Headers copied from the browser. The conditional headers If-Modified-Since and
# If-None-Match are left out: they can make the server reply 304 with an empty body.
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-CN,zh;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Cookie': '37cs_pidx=1; 37cs_user=37cs94837775047; 37cs_show=69; cscpvrich4016_fidx=1',
    'Host': 'www.ygdy8.com',
    'Referer': 'https://www.baidu.com/link?url=SVy4LWAQ4pWreyHTbiRvREyKpZQjwF6sLl4WQcnRfGC&wd=&eqid=f05b7daa0000cb640000000359a6c326',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'
}

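# 1. Send the request with the url and headers
req = requests.get(url, headers=headers)
# 2. Print the status code
status_code = req.status_code
print(status_code)
if status_code == 200:
    # Decode the body as GB2312 (the encoding used for this site) before reading .text
    req.encoding = 'gb2312'
    html = req.text
    # 3. Print the page source
    print(html)
    # 4. Save the source as an HTML file (the task asks for 'moive.html');
    #    errors='replace' keeps the write from failing on characters GB2312 cannot encode
    with open(r'C:\Users\Administrator\Desktop\moive.html', 'w', encoding='gb2312', errors='replace') as fp:
        fp.write(html)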
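
Headers copied from a browser often include the conditional headers If-Modified-Since and If-None-Match. A server may answer those with 304 Not Modified and an empty body, in which case the status code is not 200 and there is nothing to save. Below is a minimal sketch of one way to guard against that. The helper names (drop_conditional_headers, save_bytes, fetch_and_save) and the 10-second timeout are assumptions made for illustration and are not part of the original exercise. It also writes the raw response bytes, which avoids having to pick an encoding when saving.

```python
# Conditional request headers; a server may answer them with 304 and no body.
CONDITIONAL_HEADERS = {"if-modified-since", "if-none-match"}


def drop_conditional_headers(headers):
    """Return a copy of headers without the conditional ones (case-insensitive)."""
    return {k: v for k, v in headers.items() if k.lower() not in CONDITIONAL_HEADERS}


def save_bytes(content, path):
    """Write raw response bytes, so no re-encoding step can fail."""
    with open(path, "wb") as fp:
        fp.write(content)


def fetch_and_save(url, headers, path, timeout=10):
    """Hypothetical helper: fetch url and save the body only on HTTP 200."""
    # Imported here so the two helpers above work even without requests installed.
    import requests

    resp = requests.get(url, headers=drop_conditional_headers(headers), timeout=timeout)
    print(resp.status_code)
    if resp.status_code == 200:
        save_bytes(resp.content, path)
    return resp.status_code
```

For example, fetch_and_save('http://www.ygdy8.com/', headers, 'moive.html') would print the status code and, on a 200 response, write the page to moive.html in the current directory.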
