
Running multiple spiders at the same time with Scrapy

2017-06-20  BlueCat2016

Create start_spiders.py in the project root directory:

# -*- coding: utf-8 -*-
import os

# The project settings must be loaded first.
# Replace "project" with your project's name
# (the name of the directory that contains settings.py).
os.environ.setdefault('SCRAPY_SETTINGS_MODULE', 'project.settings')

from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

process = CrawlerProcess(get_project_settings())

# To run only specific spiders, schedule them by name:
# process.crawl("board_spider")
# process.crawl("favorite_spider")

# Run every spider in the project
for spider_name in process.spider_loader.list():
    # print(spider_name)
    process.crawl(spider_name)

# Start crawling; this call blocks until all scheduled spiders have finished
process.start()
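
The script above either runs every spider or needs editing to pick specific ones. A minimal sketch of one way to choose at launch time follows. The `select_spiders` and `main` names are hypothetical helpers, not part of Scrapy, and the Scrapy imports sit inside `main` only so that the selection logic can be exercised on its own.

```python
def select_spiders(available, requested):
    """Return the spider names to run.

    Hypothetical helper: if no names are requested, run everything;
    otherwise run only the requested names, rejecting unknown ones.
    """
    if not requested:
        return list(available)
    unknown = [name for name in requested if name not in available]
    if unknown:
        raise ValueError("unknown spider(s): " + ", ".join(unknown))
    return list(requested)


def main(argv):
    # Imported lazily so select_spiders() works without Scrapy installed.
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    process = CrawlerProcess(get_project_settings())
    # Schedule each chosen spider in the same process
    for name in select_spiders(process.spider_loader.list(), argv):
        process.crawl(name)
    # Blocks until every scheduled spider has finished
    process.start()
```

Assuming `main(sys.argv[1:])` is called at the bottom of start_spiders.py (with `import sys` at the top), `python start_spiders.py board_spider favorite_spider` would run just those two spiders, and `python start_spiders.py` with no arguments would run them all.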

Reference: http://blog.leanote.com/post/dapingxia@163.com/Python%E7%88%AC%E8%99%AB%E8%BF%9B%E9%98%B63%E4%B9%8BScrapy%E8%BF%90%E8%A1%8C%E5%A4%9A%E4%B8%AASpiders-2
