E-commerce and Travel Site Crawler Endpoints: A Summary

2017-08-29  名字太逗无法显示

融e购 (ICBC Mall)

Homepage top-level category XPath:
//h3/a/@href
Listing page XPath:
//div[@class="p-name"]/a/@href
Detail page:
Review API:
http://mall.icbc.com.cn/products/productEvaluations.jhtml?DisplayOrder=20&productId=0000475292
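
The listing-page XPath above can be tried out with nothing but the standard library. The sketch below is only an illustration: the markup and the hrefs in it are made up, not real site output, and `xml.etree.ElementTree` supports only a subset of XPath (no `/@href` step), so the attribute is read with `.get()`. Real pages are rarely well-formed XML, so an actual crawler would more likely use an HTML-aware parser.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed stand-in for a listing page (not real site markup).
SAMPLE_LISTING = """
<html><body>
  <div class="p-name"><a href="/item/1.html">Item A</a></div>
  <div class="p-name"><a href="/item/2.html">Item B</a></div>
  <div class="p-price">9.90</div>
</body></html>
"""


def extract_links(markup, element_path):
    """Return the href of every <a> element matched by element_path."""
    root = ET.fromstring(markup)
    # ElementTree cannot select attributes with /@href, so use .get() instead.
    return [a.get("href") for a in root.findall(element_path) if a.get("href")]


# Equivalent of //div[@class="p-name"]/a/@href in ElementTree's XPath subset.
links = extract_links(SAMPLE_LISTING, ".//div[@class='p-name']/a")
print(links)
```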

国美 (GOME)

Homepage top-level category XPath:
//h3/a/@href
Listing page XPath:
//p[@class="item-name"]/a/@href
Detail page:

Sample link: https://item.gome.com.cn/A0006124481-pop8009586601.html?intcmp=list-9000000700-1_3_1

Review API:
https://ss.gome.com.cn/item/v1/prdevajsonp/appraiseNew/A0006124481/1/all/0/10/flag/appraise
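
The product ID `A0006124481` appears both in the sample link and in the review API path, so the review URL can probably be derived from the item URL. A minimal sketch, assuming the ID is always the part of the file name before the first hyphen; the function name is hypothetical, and the remaining path segments (`1/all/0/10/flag/appraise`) are copied unchanged because their meaning is not documented here.

```python
import re
from urllib.parse import urlparse

# Path segments after the product ID are copied verbatim from the sample URL.
REVIEW_URL = ("https://ss.gome.com.cn/item/v1/prdevajsonp/appraiseNew/"
              "{product_id}/1/all/0/10/flag/appraise")


def gome_review_url(item_url):
    """Build the review API URL for a GOME item page URL."""
    path = urlparse(item_url).path  # e.g. "/A0006124481-pop8009586601.html"
    # Assumption: the product ID is the token before the first hyphen.
    match = re.match(r"/([A-Za-z0-9]+)-", path)
    if match is None:
        raise ValueError("unexpected item URL: " + item_url)
    return REVIEW_URL.format(product_id=match.group(1))


print(gome_review_url(
    "https://item.gome.com.cn/A0006124481-pop8009586601.html"
    "?intcmp=list-9000000700-1_3_1"))
```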

飞牛网 (Feiniu)

Homepage top-level category XPath:
//h3/a/@href

Sample link: http://item.feiniu.com/KS1170590300255535?tp=list.C27303.2007-item1.1.1503907640057Fdt1

Price API (accepts several comma-separated SKUs):
http://item.feiniu.com/price_qty_sku?sku_seqs=KS1170590300255535,KS1170590300255639
Review API:
http://item.feiniu.com/getCommentsByStarForAll?goodsId=PS11705300022494&goodsType=1&getType=0&curPage=1&pageSize=10&v=1503910789000
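
Since the price API takes a comma-separated `sku_seqs` list, prices for several items can be requested in one call. A rough sketch, assuming the SKU is the path component of the item link (as in the sample link above); both helper names are hypothetical.

```python
from urllib.parse import urlencode, urlparse

PRICE_ENDPOINT = "http://item.feiniu.com/price_qty_sku"


def sku_from_item_url(item_url):
    """Assumption: the SKU is the only path component of the item URL."""
    return urlparse(item_url).path.strip("/")


def feiniu_price_url(skus):
    """Build one price API URL covering every SKU in skus."""
    # safe="," keeps the separator literal instead of encoding it as %2C.
    query = urlencode({"sku_seqs": ",".join(skus)}, safe=",")
    return PRICE_ENDPOINT + "?" + query


sku = sku_from_item_url(
    "http://item.feiniu.com/KS1170590300255535"
    "?tp=list.C27303.2007-item1.1.1503907640057Fdt1")
print(feiniu_price_url([sku, "KS1170590300255639"]))
```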

Amazon China (亚马逊)

Sample link: https://www.amazon.cn/%E6%89%8B%E6%9C%BA-%E9%80%9A%E8%AE%AF/dp/B01LX5KR6D/ref=sr_1_3?s=wireless&ie=UTF8&qid=1504071662&sr=1-3&th=1
All-reviews link: https://www.amazon.cn/Apple-iPhone-7-128G-%E9%BB%91%E8%89%B2-%E7%A7%BB%E5%8A%A8%E8%81%94%E9%80%9A%E7%94%B5%E4%BF%A14G%E6%89%8B%E6%9C%BA/product-reviews/B01LX5KR6D/&reviewerType=all_reviews&sortBy=recent

携程 (Ctrip)

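Page templates are not consistent across products, and some pages do not need dynamic fetching, so each case has to be handled individually.
Sample link: http://taocan.ctrip.com/freetravel/p2897083s1.html

Review API:
# POST request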
url = 'http://online.ctrip.com/restapi/soa2/12447/json/GetCommentInfoList'
payload = {
    "CommentLevel": "0",
    "PageIndex": "1",
    "PageSize": "5",
    "ProductId": "2897083",
    "channelCode": "0",
    "platformId": "4",
    "version": "70400",
}
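
The snippet above defines the URL and payload but does not send anything. One way to turn it into a request is sketched below. Two things are assumptions: that the body is sent as JSON (suggested only by `/json/` in the endpoint path), and that `ProductId` is the number between `p` and `s` in the product link, which holds for the sample link. The helper names are hypothetical, and the request is only built here, not sent.

```python
import json
import re
import urllib.request

COMMENT_URL = ("http://online.ctrip.com/restapi/soa2/12447/json/"
               "GetCommentInfoList")


def product_id_from_url(product_url):
    """Assumption: .../p<ProductId>s<n>.html, as in the sample link."""
    match = re.search(r"/p(\d+)s\d+\.html", product_url)
    if match is None:
        raise ValueError("unexpected product URL: " + product_url)
    return match.group(1)


def build_comment_request(product_id, page_index=1, page_size=5):
    """Build (but do not send) the POST request for one page of reviews."""
    payload = {
        "CommentLevel": "0",
        "PageIndex": str(page_index),
        "PageSize": str(page_size),
        "ProductId": product_id,
        "channelCode": "0",
        "platformId": "4",
        "version": "70400",
    }
    # Assumption: the endpoint expects a JSON-encoded body.
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        COMMENT_URL, data=body,
        headers={"Content-Type": "application/json"}, method="POST")


request = build_comment_request(
    product_id_from_url("http://taocan.ctrip.com/freetravel/p2897083s1.html"))
# To actually send it: urllib.request.urlopen(request).read()
```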

去哪儿 (Qunar)

Relatively hard to fetch.

Sample link: http://taocan.ctrip.com/freetravel/p2897083s1.html

Review API:
import urllib.parse
import urllib.request

# Content-Length is omitted on purpose: urllib computes it from the body.
# Accept-Encoding is omitted so the response is not returned compressed.
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:55.0) Gecko/20100101 Firefox/55.0',
    'Host': 'qlhd2.package.qunar.com',
}
url = 'https://qlhd2.package.qunar.com/user/comment/product/queryComments.json'
body_value = {
    "type": "all",
    "pageNo": "1",
    "pageSize": "10",
    "productId": "1460660174",
    "rateStatus": "ALL",
}
# Form-encode the body; passing data makes this a POST request.
data = urllib.parse.urlencode(body_value).encode('utf-8')
request = urllib.request.Request(url, data=data, headers=headers)
result = urllib.request.urlopen(request).read()
# Assumes the response is UTF-8 text.
print(result.decode('utf-8', errors='replace'))

飞猪网 (Fliggy)

Sample link: https://items.fliggy.com/item.htm?spm=181.7621407.a1z9b.18.255442a8qRygIz&id=38333678815&scm=20140635.1_1_4.0.0b83e29515039939125018002e1eed

Review API:
https://rate.tmall.com/list_detail_rate.htm?itemId=38333678815&spuId=0&sellerId=1112797297&order=3&currentPage=1
# sellerId is taken from the page source, where it appears as:  sellerId       : '1112797297',
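
Putting the two pieces together: `itemId` comes from the `id` query parameter of the item link, and `sellerId` has to be pulled out of the page source. The sketch below assumes the source contains the `sellerId : '…'` form quoted above; the helper names are hypothetical, and `spuId=0` and `order=3` are simply copied from the sample URL.

```python
import re
from urllib.parse import parse_qs, urlencode, urlparse

RATE_ENDPOINT = "https://rate.tmall.com/list_detail_rate.htm"


def item_id_from_url(item_url):
    """Read the item ID from the id= query parameter."""
    return parse_qs(urlparse(item_url).query)["id"][0]


def seller_id_from_source(page_source):
    """Assumption: the page contains  sellerId : '<digits>'  somewhere."""
    match = re.search(r"sellerId\s*:\s*'(\d+)'", page_source)
    if match is None:
        raise ValueError("sellerId not found in page source")
    return match.group(1)


def fliggy_rate_url(item_id, seller_id, page=1):
    """Build the review URL for one page; spuId and order follow the sample."""
    params = [("itemId", item_id), ("spuId", "0"), ("sellerId", seller_id),
              ("order", "3"), ("currentPage", str(page))]
    return RATE_ENDPOINT + "?" + urlencode(params)


item_id = item_id_from_url(
    "https://items.fliggy.com/item.htm?spm=181.7621407.a1z9b.18.255442a8qRygIz"
    "&id=38333678815&scm=20140635.1_1_4.0.0b83e29515039939125018002e1eed")
seller_id = seller_id_from_source("sellerId       : '1112797297',")
print(fliggy_rate_url(item_id, seller_id))
```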

途牛网 (Tuniu)

Sample link: http://www.tuniu.com/tour/210135150

Review API:
http://www.tuniu.com/papi/product/remarkList?productId=210135150&productType=1

驴妈妈 (Lvmama)

Reviews are embedded directly in the page source.
