Learning POST requests with urllib

2018-06-10  FirmStone

GET requests are fairly simple, so I won't go into them; here is a snippet as a refresher.

GET request

from urllib import parse, request
import random

url = 'http://www.baidu.com/s'

keyword = input('Enter a search keyword: ')
wd = {'wd': keyword}

encoded_wd = parse.urlencode(wd)
new_url = url + '?' + encoded_wd

print(new_url)
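# Request the URL that carries the query string, not the bare base URL
req = request.Request(new_url)
# Send a browser-like User-Agent so the site is less likely to block our IP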
ua_list = [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Mozilla/5.0 (Windows NT 6.1; rv2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; en) Presto/2.8.131 Version/11.11",
    "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11"
]

# random.choice picks one element at random from the User-Agent list
user_agent = random.choice(ua_list)
req.add_header('User-Agent', user_agent)

response = request.urlopen(req)
print(response.read().decode('utf-8'))
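
To make the query-string step concrete, here is a minimal sketch of what parse.urlencode does to a non-ASCII keyword. The helper name build_search_url is made up for illustration and is not part of the original code; nothing here touches the network.

```python
from urllib import parse


def build_search_url(base_url, keyword):
    """Hypothetical helper: append a percent-encoded 'wd' parameter to base_url."""
    # urlencode percent-encodes the value as UTF-8 and turns spaces into '+'
    return base_url + '?' + parse.urlencode({'wd': keyword})


print(build_search_url('http://www.baidu.com/s', '翻译'))
```

The printed URL is what should be passed to request.Request, since the server only sees the keyword if it is part of the URL.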

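POST request

Note: this example relies on the Youdao Translate web page, which submits the word to be translated with a POST request. That is the premise for everything below.

1. Use a packet-capture tool to get the URL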

url = 'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'

2. Build the form data

i=smart%0A&from=AUTO&to=AUTO&smartresult=dict&client=fanyideskweb&salt=1528635335125&sign=df74f2062975d33318777e2cdae0af41&doctype=json&version=2.1&keyfrom=fanyi.web&action=FY_BY_CLICKBUTTION&typoResult=false

Split up and tidied, it looks like this:

formbody = {
    "i": "smart",
    "from": "AUTO",
    "to": "AUTO",
    "smartresult": "dict",
    "client": "fanyideskweb",
    "doctype": "json",
    "version": "2.1",
    "keyfrom": "fanyi.web",
    "action": "FY_BY_CLICKBUTTION",
    "typoResult": "false"
}
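
As an alternative to splitting the captured body by hand, the standard library can decode it directly. This is only a sketch: the helper name form_to_dict is an assumption for illustration, and unlike the dict above it keeps every captured field, including salt and sign.

```python
from urllib.parse import parse_qsl

# The form body exactly as captured above
raw = ("i=smart%0A&from=AUTO&to=AUTO&smartresult=dict&client=fanyideskweb"
       "&salt=1528635335125&sign=df74f2062975d33318777e2cdae0af41"
       "&doctype=json&version=2.1&keyfrom=fanyi.web"
       "&action=FY_BY_CLICKBUTTION&typoResult=false")


def form_to_dict(raw_body):
    """Hypothetical helper: decode a urlencoded form body into a dict."""
    # parse_qsl splits on '&' and '=', and undoes the percent-encoding
    return dict(parse_qsl(raw_body, keep_blank_values=True))


formbody = form_to_dict(raw)
print(formbody)
```

Note that %0A decodes to a newline, so the value of "i" comes back as "smart" followed by a line break; strip it if that is not wanted.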


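Sublime Text's regex find-and-replace makes this conversion quick. The pattern is anchored per line, so first put each key=value pair on its own line (replace every & with a newline), then run:

Find:       ^(.*)=(.*)$
Replace:    "\1":"\2",

Next, encode the form and send the request (headers is a dict holding the User-Agent):

data = urllib.parse.urlencode(formbody)  # urlencode() returns a str, not bytes
request = urllib.request.Request(url, data=data, headers=headers)

print(urllib.request.urlopen(request).read().decode('utf-8'))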

This is where I hit the first error:
TypeError: POST data should be bytes, an iterable of bytes, or a file object. It cannot be of type str.
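
The type mismatch behind this error can be seen without sending anything. The following is a minimal sketch with a shortened, made-up form dict; it only inspects the types and builds the Request object, it does not call urlopen.

```python
import urllib.parse
import urllib.request

url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
form = {"i": "smart", "doctype": "json"}  # shortened form, for illustration only

encoded = urllib.parse.urlencode(form)  # a str such as 'i=smart&doctype=json'
data = encoded.encode('utf-8')          # bytes, which is what POST data must be

print(type(encoded), type(data))

# Passing data makes urllib treat the request as a POST
req = urllib.request.Request(url, data=data)
print(req.get_method())
```

So the fix is a single encode step between urlencode and Request.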

After fixing that and running again, the response was:

{"errorCode":50}

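More searching showed that someone else had already stepped on this mine: http://bbs.fishc.com/thread-96638-1-1.html. Following the fix described there (dropping _o from the URL) solved the problem.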

import urllib.request, urllib.parse

user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv2.0.1) Gecko/20100101 Firefox/4.0.1"


url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'

headers = {'User-Agent': user_agent}
formbody = {
    "i": "smart",
    "from": "AUTO",
    "to": "AUTO",
    "smartresult": "dict",
    "client": "fanyideskweb",
    "doctype": "json",
    "version": "2.1",
    "keyfrom": "fanyi.web",
    "action": "FY_BY_CLICKBUTTION",
    "typoResult": "false"
}
# urlencode() returns a str, but POST data must be bytes, so encode it.
# Equivalent: data = urllib.parse.urlencode(formbody).encode(encoding='utf-8')
data = bytes(urllib.parse.urlencode(formbody), encoding='utf-8')
request = urllib.request.Request(url, data=data, headers=headers)

print(urllib.request.urlopen(request).read().decode('utf-8'))

The result:

{"type":"EN2ZH_CN","errorCode":0,"elapsedTime":10,"translateResult":[[{"src":"smart","tgt":"聪明的"}]]}
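
Since the response is JSON, the translation can be pulled out with the json module instead of reading it off the raw text. This is a sketch that assumes the response always has the shape shown above; the helper name extract_translations is made up, and treating any non-zero errorCode as a failure is my assumption based on the two responses seen in this post.

```python
import json

# The response body shown above, used here as sample input
sample = ('{"type":"EN2ZH_CN","errorCode":0,"elapsedTime":10,'
          '"translateResult":[[{"src":"smart","tgt":"聪明的"}]]}')


def extract_translations(body):
    """Hypothetical helper: return the translated strings from a response body."""
    payload = json.loads(body)
    # Assumption: errorCode 0 means success, anything else is a failure
    if payload.get("errorCode") != 0:
        raise ValueError("translation failed: errorCode=%s" % payload.get("errorCode"))
    # translateResult is a list of lists of {"src": ..., "tgt": ...} segments
    return [seg["tgt"] for group in payload["translateResult"] for seg in group]


print(extract_translations(sample))
```

With the earlier {"errorCode":50} body, the same helper raises ValueError rather than returning an empty result.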


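As for the second problem, I don't know why removing _o makes it work. It is still worth thinking about, though.
Clicking http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule to open it directly in the browser shows: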

{"errorCode":50}

Opening http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule instead redirects to the Youdao site. A reminder to myself: from now on, always try a URL in the browser first.
