
Deploying a Scrapy project on a server with scrapyd and setting up a scheduled task

2019-11-19  by 嗨_小罗哥

Before starting, please install Python and create a virtual environment on the server yourself.

Upload the project code to the server.
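As an alternative to copying files by hand, the scrapyd-client package provides the scrapyd-deploy command, which packages the project as an egg and uploads it through scrapyd's addversion.json endpoint. It reads a [deploy] section from the project's scrapy.cfg; for this setup it might look like the following (the target name "server" is a placeholder, the URL matches the scrapyd instance used later in this article):

```ini
# scrapy.cfg — hypothetical deploy target for scrapyd-deploy
[deploy:server]
url = http://123.56.16.18:6800/
project = U17
```

Running scrapyd-deploy server -p U17 from the project directory would then upload a new version of the project.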

Concrete steps: install scrapyd inside the virtual environment (pip install scrapyd) and edit its configuration file (for example /etc/scrapyd/scrapyd.conf). The configuration used here keeps the defaults, except that bind_address is set to 0.0.0.0 so the web console is reachable from outside the server:

[scrapyd]
eggs_dir    = eggs
logs_dir    = logs
items_dir   =
jobs_to_keep = 5
dbs_dir     = dbs
max_proc    = 0
max_proc_per_cpu = 10
finished_to_keep = 100
poll_interval = 5.0
bind_address = 0.0.0.0
http_port   = 6800
debug       = off
runner      = scrapyd.runner
application = scrapyd.app.application
launcher    = scrapyd.launcher.Launcher
webroot     = scrapyd.website.Root

[services]
schedule.json     = scrapyd.webservice.Schedule
cancel.json       = scrapyd.webservice.Cancel
addversion.json   = scrapyd.webservice.AddVersion
listprojects.json = scrapyd.webservice.ListProjects
listversions.json = scrapyd.webservice.ListVersions
listspiders.json  = scrapyd.webservice.ListSpiders
delproject.json   = scrapyd.webservice.DeleteProject
delversion.json   = scrapyd.webservice.DeleteVersion
listjobs.json     = scrapyd.webservice.ListJobs
daemonstatus.json = scrapyd.webservice.DaemonStatus
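Each entry in [services] maps a URL path to its handler, so every API endpoint lives at http://host:port/name. A minimal sketch of building those URLs (host and port taken from the config above; the IP is the server used later in this article):

```python
from urllib.parse import urljoin

# Base URL of the scrapyd instance (http_port = 6800 from the config above).
SCRAPYD = "http://123.56.16.18:6800/"

def endpoint(name: str) -> str:
    """Build the full URL for one of the [services] endpoints, e.g. 'schedule.json'."""
    return urljoin(SCRAPYD, name)

# endpoint("daemonstatus.json") -> "http://123.56.16.18:6800/daemonstatus.json"
```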

Setting up the scheduled task. First create a shell script (here /home/shell/U17.sh):

#!/bin/bash
source /home/python_env/env/bin/activate
cd /home/src/U17
curl http://123.56.16.18:6800/schedule.json -d project=U17 -d spider=yaoqi

1. The source line activates the virtual environment.
2. The cd line changes into the Scrapy project directory.
3. The curl line calls scrapyd's schedule.json API to start the yaoqi spider of project U17.
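On success, schedule.json replies with a small JSON document containing the job id. A sketch of checking that reply (the node_name and jobid values below are illustrative; the keys are part of scrapyd's documented response):

```python
import json

# Example response body from schedule.json; values are illustrative.
body = '{"node_name": "server1", "status": "ok", "jobid": "6487ec79947e"}'

result = json.loads(body)
if result["status"] == "ok":
    print("scheduled job", result["jobid"])
```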

If starting cron prints "Redirecting to /bin/systemctl start crond.service", it means the old service crond start command has been replaced by the systemctl form (systemctl start crond). Then add the following entry with crontab -e:

50 15 19 * * sh /home/shell/U17.sh > /home/shell/spider.log
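The five leading fields of a crontab entry are minute, hour, day-of-month, month and day-of-week, so the entry above runs at 15:50 on the 19th of every month. A quick way to label the fields when double-checking a schedule:

```python
# Label the five crontab fields of the entry above.
entry = "50 15 19 * *"
names = ["minute", "hour", "day_of_month", "month", "day_of_week"]
fields = dict(zip(names, entry.split()))
# fields -> {'minute': '50', 'hour': '15', 'day_of_month': '19',
#            'month': '*', 'day_of_week': '*'}
```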

Once scheduled, the running job shows up in scrapyd's web console (screenshot omitted).
