Sending HTTP/2 Requests with Scrapy

2022-08-17 会爬虫的小蟒蛇

This post uses https://match.yuanrenxue.com/api/match/17?page=1 as the example site.
Note: https://www.hgy209.com/xmgs is also served over h2.

[Screenshot: 1.png]

The screenshot shows that the site is served over HTTP/2.

If you request it over HTTP/1.1, the request fails with an error.
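The protocol can also be checked without a browser. The following is a minimal sketch using only Python's standard library: it offers both `h2` and `http/1.1` through TLS ALPN and reports which one the server selects. The helper names `make_alpn_context` and `negotiated_protocol` are made up for this illustration. A result of `h2` only shows that the server supports HTTP/2; it does not by itself show that HTTP/1.1 is rejected.

```python
import socket
import ssl


def make_alpn_context() -> ssl.SSLContext:
    """Build a TLS client context that offers HTTP/2 first, then HTTP/1.1."""
    context = ssl.create_default_context()
    context.set_alpn_protocols(["h2", "http/1.1"])
    return context


def negotiated_protocol(host: str, port: int = 443, timeout: float = 10.0):
    """Return the ALPN protocol chosen by the server, or None if none was chosen."""
    context = make_alpn_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()


if __name__ == "__main__":
    # Needs network access; the host is the example site from this post.
    print(negotiated_protocol("match.yuanrenxue.com"))
```

Running the script prints the negotiated protocol name for the example host.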

Per-spider HTTP/2 configuration

import scrapy


class TestSpider(scrapy.Spider):
    name = 'test'
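    custom_settings = {
        # Route https:// requests through Scrapy's HTTP/2 download handler
        # (needs Twisted's HTTP/2 extras: pip install "Twisted[http2]").
        "DOWNLOAD_HANDLERS": {
            'https': 'scrapy.core.downloader.handlers.http2.H2DownloadHandler',
        }
    }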

    def start_requests(self):
        yield scrapy.Request(
            url="https://match.yuanrenxue.com/api/match/17?page=1",
            callback=self.parse,
        )

    def parse(self, response):
        print(response)

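With this handler configured, Scrapy can easily fetch data that is served over HTTP/2.

The example above sets the handler locally, on a single spider, through custom_settings.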

Project-wide HTTP/2 configuration

To enable it for every spider in the project, set it directly in settings.py:

DOWNLOAD_HANDLERS = {
    'https': 'scrapy.core.downloader.handlers.http2.H2DownloadHandler',
}
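Setting only the `https` key does not remove the handlers for other schemes, because Scrapy layers `DOWNLOAD_HANDLERS` on top of its built-in `DOWNLOAD_HANDLERS_BASE`. The sketch below models that merge with plain dictionaries rather than Scrapy's own settings classes. `effective_handlers` is a hypothetical helper, and the base mapping is an abridged assumption of the defaults, whose exact handler paths may differ between Scrapy versions.

```python
# Abridged stand-in for Scrapy's DOWNLOAD_HANDLERS_BASE (assumption: only the
# two HTTP schemes are listed here, and the paths may vary by Scrapy version).
DOWNLOAD_HANDLERS_BASE = {
    "http": "scrapy.core.downloader.handlers.http.HTTPDownloadHandler",
    "https": "scrapy.core.downloader.handlers.http.HTTPDownloadHandler",
}

# The override from this post: only the https scheme is switched to HTTP/2.
DOWNLOAD_HANDLERS = {
    "https": "scrapy.core.downloader.handlers.http2.H2DownloadHandler",
}


def effective_handlers(base, overrides):
    """Return a new mapping in which overrides win, scheme by scheme."""
    merged = dict(base)
    merged.update(overrides)
    return merged


handlers = effective_handlers(DOWNLOAD_HANDLERS_BASE, DOWNLOAD_HANDLERS)
for scheme, handler in sorted(handlers.items()):
    print(f"{scheme}: {handler}")
```

In this model, plain `http://` requests keep the default handler, and only `https://` requests go through `H2DownloadHandler`.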
