Scraping Memes with Python

2019-03-13 Jupiter_19

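A few days ago I came across a topic on Zhihu: what are some silly (沙雕) memes? That made me want to save the images locally with Python instead of downloading them by hand. After a quick try, though, I found that Zhihu no longer lets third-party crawlers fetch its pages. So I switched to a Douban page instead, https://www.douban.com/group/topic/128794851/, and managed to scrape the memes from it.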

[Image: scraping results]

[Image: full code]

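As someone who studies mathematics, I don't normally use scraping-related libraries such as re and requests, so here is a rough walkthrough of how the code works.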

import requests
# Fetch the HTML source of the Douban topic page as text
url = 'https://www.douban.com/group/topic/128794851/'
data = requests.get(url).text
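
Some sites turn away requests that arrive with the default requests User-Agent, so a slightly more defensive fetch may be useful. The following is only a sketch under assumptions: the function name `fetch_html`, the `HEADERS` value, and the 10-second timeout are illustrative choices, not part of the original code, and whether a browser-like header is needed for this particular page is not something the original states.

```python
import requests

# Assumption: a browser-like User-Agent; the exact string is illustrative.
HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}


def fetch_html(url, session=None, timeout=10):
    """Return the page source of url as text, raising on HTTP errors."""
    # Use the given session if there is one, else the module-level requests API.
    http = session if session is not None else requests
    response = http.get(url, headers=HEADERS, timeout=timeout)
    # Turn 4xx/5xx responses into an exception instead of parsing an error page.
    response.raise_for_status()
    return response.text
```

With this in place, `data = fetch_html(url)` would play the role of `requests.get(url).text` above.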

[Image: page source]

import re

# Use a capture group so findall returns just the URL inside src="..."
fir = re.findall(r'img src="(.*?)" width', data)
# Keep all but the last match, as the original code does
fir = fir[:-1]
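
To see what the extraction step does, it helps to run the pattern on a small piece of HTML. The sketch below wraps the regular expression in a function and uses a capture group, so `re.findall` returns the URLs themselves as a list of strings. The names `IMG_SRC_PATTERN`, `extract_image_urls`, and `sample_html` are made up for illustration, and the sample markup is an assumption about what the page's image tags look like.

```python
import re

# Matches <img src="..." width=...> and captures only the URL in between.
IMG_SRC_PATTERN = re.compile(r'img src="(.*?)" width')


def extract_image_urls(html):
    """Return the src URL of every matching img tag, in page order."""
    return IMG_SRC_PATTERN.findall(html)


# Hypothetical sample markup, standing in for the real page source.
sample_html = (
    '<div><img src="https://example.com/a.jpg" width="500"></div>'
    '<div><img src="https://example.com/b.gif" width="300"></div>'
)
urls = extract_image_urls(sample_html)
print(urls)
```

Because the pattern is non-greedy (`.*?`), each match stops at the first `" width` after its own `src`, so two tags on the same line are not merged into one match.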

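from io import BytesIO
from PIL import Image

# Here url is one image URL taken from fir, and img_path is the local path to save it to
response = requests.get(url)
img_data = response.content  # raw bytes of the image
image = Image.open(BytesIO(img_data))
image.save(img_path)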
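
The snippet above handles a single image. One possible way to loop over all the extracted URLs is sketched below. Note that this is not the original method: it writes the downloaded bytes straight to disk instead of going through Pillow's `Image.open` and `image.save`, which also leaves animated GIFs exactly as received. The names `filename_from_url`, `save_images`, `out_dir`, and `fetch`, as well as the `.jpg` fallback extension, are assumptions made for illustration.

```python
import os
from urllib.parse import urlparse


def filename_from_url(img_url, index):
    """Build a local file name such as '003.gif' from an image URL."""
    ext = os.path.splitext(urlparse(img_url).path)[1]
    # Assumption: fall back to .jpg when the URL has no file extension.
    return "%03d%s" % (index, ext or ".jpg")


def save_images(img_urls, out_dir, fetch):
    """Download each URL with fetch(url) -> bytes and write it under out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for index, img_url in enumerate(img_urls):
        img_path = os.path.join(out_dir, filename_from_url(img_url, index))
        with open(img_path, "wb") as f:
            f.write(fetch(img_url))  # raw bytes, written unchanged
        saved.append(img_path)
    return saved
```

With requests, a call might look like `save_images(fir, 'memes', fetch=lambda u: requests.get(u, timeout=10).content)`. Passing the download step in as `fetch` keeps the file handling separate from the network code, so it can be tried out without touching the site.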