Scrapy structure

2017-07-15  方方块

READ THIS

Items (items.py)

gives the data Scrapy crawls a defined structure, so records stay ordered and are easy to serialize
how to use
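
As a rough illustration of "ordered and serializable", here is a minimal sketch of an item definition. The classic form is a `scrapy.Item` subclass whose attributes are `scrapy.Field()`; this sketch swaps in a standard-library dataclass instead, which Scrapy 2.2 and later also accept as an item type, so that it runs without Scrapy installed. The class name `ArticleItem`, its fields, and the URL are made up for the example.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ArticleItem:
    """Hypothetical item: one scraped article with a fixed set of fields."""
    title: str
    url: str
    views: Optional[int] = None  # optional field, left empty if not scraped


# A spider callback would normally build and yield this from a response.
item = ArticleItem(title="Scrapy structure", url="https://example.com/post")

# Because the fields are declared up front, the item converts cleanly
# to a dict and from there to JSON.
serialized = json.dumps(asdict(item))
print(serialized)
```

Declaring the fields in one place means a typo in a field name fails immediately instead of silently creating a new key, which is the main gain over passing loose dicts around.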

Pipeline

receives each item the spider yields and processes it, for example cleaning, validating, or storing it
how to use
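
A pipeline is a plain Python class with a `process_item(self, item, spider)` method, which Scrapy calls once per scraped item. The sketch below assumes the item is dict-like (a plain dict or a `scrapy.Item`); the class name `CleanTitlePipeline` and the `title` field are hypothetical.

```python
class CleanTitlePipeline:
    """Hypothetical pipeline: tidies the 'title' field of every item."""

    def process_item(self, item, spider):
        # Scrapy calls this once per item; whatever is returned is
        # handed to the next pipeline in the chain.
        title = item.get("title") or ""
        # Collapse runs of whitespace (including newlines) to single spaces.
        item["title"] = " ".join(title.split())
        return item


# Stand-alone check: a plain dict as the item, no spider needed.
cleaned = CleanTitlePipeline().process_item(
    {"title": "  Scrapy \n structure "}, spider=None
)
print(cleaned)
```

To switch a pipeline on in a real project, it is listed in `ITEM_PIPELINES` in settings.py, for example `ITEM_PIPELINES = {"myproject.pipelines.CleanTitlePipeline": 300}`, where `myproject` is a placeholder for the project name and the number sets the order in which pipelines run (lower runs first).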

Settings

DOWNLOAD_DELAY = 3 waits about 3 seconds between requests to the same site, to be friendlier to the scraped site
USER_AGENT set it to look like a browser rather than a robot
ROBOTSTXT_OBEY = True

ROBOTSTXT_OBEY in settings.py should be True if the site has a robots.txt, to be a good web citizen
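
Put together, the three settings above might look like this in a project's settings.py. This is only a sketch: the `USER_AGENT` string is an example browser-style value chosen for illustration, not one taken from the original notes.

```python
# settings.py (fragment)

# Wait about 3 seconds between requests to the same site.
DOWNLOAD_DELAY = 3

# Example browser-like user agent string (placeholder, replace with your own).
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"
)

# Respect the site's robots.txt rules.
ROBOTSTXT_OBEY = True
```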
