2018-12-16 POST requests

2018-12-16  最近练习R语言
requests.session()

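The goal is to simulate a login and fetch the pages that are only served after logging in.
Similar approaches: put the Cookie in the headers, or pass cookies=<dict> (the method is covered in tip 03 of the Chrome analysis notes).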
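
As a rough sketch of the two cookie options above: the Cookie string copied from Chrome's Network panel can be sent unchanged in the headers, or split into a dict and passed through cookies=. The helper names and the sample cookie values are made up for illustration.

```python
import requests


def cookie_string_to_dict(cookie_string):
    """Split a raw 'name=value; name2=value2' Cookie header into a dict."""
    cookies = {}
    for pair in cookie_string.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        # Split on the first '=' only, since cookie values may contain '='.
        name, _, value = pair.partition("=")
        cookies[name] = value
    return cookies


def get_with_cookie_header(url, cookie_string, headers=None):
    """Option 1: send the copied string unchanged in the Cookie header."""
    merged = dict(headers or {})
    merged["Cookie"] = cookie_string
    return requests.get(url, headers=merged, timeout=10)


def get_with_cookie_dict(url, cookie_string, headers=None):
    """Option 2: pass the cookies as a dict through cookies=."""
    cookies = cookie_string_to_dict(cookie_string)
    return requests.get(url, headers=headers, cookies=cookies, timeout=10)


# Hypothetical cookie string, as it might be copied from the browser.
raw_cookie = "sessionid=abc123; theme=dark"
cookie_dict = cookie_string_to_dict(raw_cookie)
```

Either function should produce the same logged-in page as long as the copied cookie is still valid; once it expires, a fresh one has to be copied from the browser.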

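When simulating a login with a session:

If the form's input fields and the action URL can be found in the page source, build the POST data from them directly.
If not, capture the traffic (tick "Preserve log" before logging in) and look up the login request in the Network panel.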
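
The login flow described above could look roughly like the sketch below. The URLs and the form field names ("email", "password") are placeholders, not taken from any real site; they would come from the form's action attribute and input names, or from the captured login request.

```python
import requests

# Hypothetical endpoints; replace with the form's action URL and the target page.
LOGIN_URL = "http://example.com/login"
PROFILE_URL = "http://example.com/profile"


def build_login_data(username, password):
    """Map the form's input names (assumed here) to the values to submit."""
    return {"email": username, "password": password}


def fetch_logged_in_page(session, username, password):
    """POST the login form, then GET a page using the same session."""
    session.post(LOGIN_URL, data=build_login_data(username, password), timeout=10)
    # The session stores cookies set by the login response and sends them here.
    response = session.get(PROFILE_URL, timeout=10)
    return response.content.decode()
```

A call would be along the lines of fetch_logged_in_page(requests.session(), "user", "secret"). The point of the session is that the cookies from the login response are reused automatically on the second request.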

cookies
import requests

response=requests.get('http://www.baidu.com')
a=requests.utils.dict_from_cookiejar(response.cookies)
b=requests.utils.cookiejar_from_dict(a)
# convert the cookie jar to a dict (a), and a dict back to a cookie jar (b)
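
The same conversion can be tried without any network access. The sketch below uses made-up cookie values and also loads the jar into a session, which is one way (an assumption here, not something the notes state) to reuse saved cookies.

```python
import requests

# Sample cookie values, made up for illustration.
cookie_dict = {"sessionid": "abc123", "theme": "dark"}

# dict -> cookie jar
jar = requests.utils.cookiejar_from_dict(cookie_dict)
# cookie jar -> dict
restored = requests.utils.dict_from_cookiejar(jar)

# Load the cookies into a session so later requests made with it send them.
session = requests.session()
session.cookies.update(jar)
session_cookies = requests.utils.dict_from_cookiejar(session.cookies)
```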

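requests.utils.unquote('https://tieba.baidu.com/f?kw=%E6%B5%81%E6%B5%AA%E6%B1%89')
requests.utils.quote('https://tieba.baidu.com/f?kw=流浪汉')
# URL decoding and encoding; note that quote() on a whole URL also escapes ':', '?' and '=', so it is usually applied to the query value only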
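
A small sketch of quoting only the query value, so the rest of the URL stays intact. The helper name build_tieba_url is hypothetical; the URL and keyword are the ones used above.

```python
import requests

BASE_URL = "https://tieba.baidu.com/f"


def build_tieba_url(keyword):
    """Percent-encode only the keyword, leaving ':', '?' and '=' untouched."""
    return BASE_URL + "?kw=" + requests.utils.quote(keyword)


encoded_url = build_tieba_url("流浪汉")
decoded_url = requests.utils.unquote(encoded_url)
```

When the parameters are passed through params= instead, requests does this encoding itself, so manual quoting is mainly useful when the URL string is assembled by hand.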
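response=requests.get('http://www.baidu.com',verify=False)
# verify=False skips SSL certificate verification (only relevant for https URLs)

response=requests.get('http://www.baidu.com',timeout=10)
# timeout in seconds; an exception is raised if the server does not respond in time

assert response.status_code==200
# check the status code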
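from retrying import retry
@retry(stop_max_attempt_number=7)
def blabla():
    pass
# retrying module: the decorated function is re-run, up to 7 attempts in total, whenever it raises an exception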

import requests
from retrying import retry

headers={"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36"}

@retry(stop_max_attempt_number=3)  # retry up to 3 attempts when an exception is raised
def _parse_url(url,method,data):
    print('*'*20)  # printed once per attempt
    if method=='POST':
        response=requests.post(url,data=data,headers=headers,timeout=3)
    else:
        response=requests.get(url,headers=headers,timeout=3)
    assert response.status_code==200
    return response.content.decode()


def parse_url(url,method='GET',data=None):
    # return the decoded page, or None if every attempt failed
    try:
        html_str=_parse_url(url,method,data)
    except Exception:
        html_str=None

    return html_str


if __name__=='__main__':
    url='http://www.baidu.com'
    print(parse_url(url))