The only-article-you-need series: how to batch-run Scrapy spiders?

2020-03-01  安森老叔叔

Let's get to it

1. Create a directory (any name works, e.g. commands) at the same level as spiders
2. Inside it, create a crawlall.py file (this filename becomes the name of the custom command; see the layout sketch below)


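The original post had a screenshot here; in its place, here is a rough sketch of the expected project layout (the article project name is taken from step 4's COMMANDS_MODULE value, and the empty __init__.py is there so Python treats commands as a regular importable package):

article/
├── article/
│   ├── commands/
│   │   ├── __init__.py   # empty; marks commands as a package
│   │   └── crawlall.py
│   ├── spiders/
│   │   └── ...
│   └── settings.py
└── scrapy.cfg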

3. The contents of crawlall.py:

from scrapy.commands import ScrapyCommand


class Command(ScrapyCommand):

    # a Scrapy project must be present for this command to run
    requires_project = True

    def syntax(self):
        return '[options]'

    def short_desc(self):
        return 'Runs all of the spiders'

    def run(self, args, opts):
        # spider_loader lists every spider registered in the project
        # (the old crawler_process.spiders alias was deprecated in
        # Scrapy 1.0 and later removed, so use spider_loader instead)
        spider_list = self.crawler_process.spider_loader.list()
        for name in spider_list:
            # schedule each spider; they all share one reactor
            self.crawler_process.crawl(name)
        # start the reactor and block until every spider has finished
        self.crawler_process.start()

4. Add the following to settings.py 👇👇👇

# COMMANDS_MODULE = '<project name>.<directory name>'
COMMANDS_MODULE = 'article.commands'

5. Run it in the terminal

scrapy crawlall
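
If you would rather not wire up a custom command at all, the same batch run works from a plain script using Scrapy's public CrawlerProcess API. A minimal sketch (the run_all.py filename is just an example): drop it next to scrapy.cfg and run it with python run_all.py.

# run_all.py
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

# build a process that uses the project's settings.py
process = CrawlerProcess(get_project_settings())

# schedule every spider the project's spider loader knows about
for name in process.spider_loader.list():
    process.crawl(name)

process.start()  # blocks until all spiders have finished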

And that's it!
