[week2] day1: Basic usage of MongoDB

2016-09-08  霍淇三公子

1, Basics

import pymongo
client = pymongo.MongoClient('localhost', 27017)
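walden = client['walden']  # database, comparable to a spreadsheet file; created on first write
sheet_lines = walden['sheet_tag']  # collection, comparable to one sheet in that file

Mnemonic for the query operators: l = less, g = greater, e = equal, n = not. So $lt, $lte, $gt, $gte and $ne mean less than, less than or equal, greater than, greater than or equal, and not equal.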
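To make the mnemonic concrete, here is a small pure-Python sketch that mimics how these five operators compare a field against a value. The helper `matches` and the sample `rooms` list are hypothetical and exist only for illustration; this is not how pymongo or the MongoDB server evaluates a query, and it only handles top-level fields that are present in the document.

```python
import operator

# Map each MongoDB comparison operator to the equivalent Python comparison.
COMPARISONS = {
    '$lt': operator.lt,    # less than
    '$lte': operator.le,   # less than or equal
    '$gt': operator.gt,    # greater than
    '$gte': operator.ge,   # greater than or equal
    '$ne': operator.ne,    # not equal
}

def matches(document, query):
    """Return True if `document` satisfies a query such as {'price': {'$gte': 500}}.

    Hypothetical helper for illustration only.
    """
    for field, conditions in query.items():
        value = document[field]
        for op, operand in conditions.items():
            if not COMPARISONS[op](value, operand):
                return False
    return True

# Made-up sample data standing in for documents in a collection.
rooms = [
    {'title': 'A', 'price': 300.0},
    {'title': 'B', 'price': 500.0},
    {'title': 'C', 'price': 800.0},
]

# Several operators on one field are combined with AND.
query = {'price': {'$gte': 500, '$lt': 800}}
selected = [room['title'] for room in rooms if matches(room, query)]
print(selected)
```

Against a real collection, the same `query` dict would be passed to `sheet_lines.find(query)`.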

2, Practice

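Scrape the listings on the first three result pages of Xiaozhu (小猪租房) short-term rentals, then pick out the listings priced above 500 RMB.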

The Code:

import pymongo, requests, time
from bs4 import BeautifulSoup

client = pymongo.MongoClient('localhost', 27017)
walden = client['2_1homework']
sheet_lines = walden['2_1homework']

urls = ['http://bj.xiaozhu.com/search-duanzufang-p{}-0/'.format(i) for i in range(1, 4)]

def get_details(url):
    """Scrape one result page and store each listing's title and price."""
    wb_data = requests.get(url)
    soup = BeautifulSoup(wb_data.text, 'lxml')
    titles = soup.select('#page_list > ul > li > div.result_btm_con.lodgeunitname > div > a > span')
    prices = soup.select('#page_list > ul > li > div > span.result_price > i')
    # zip pairs each title with its price and stops at the shorter list,
    # so a mismatch in lengths cannot raise an IndexError
    for index, (title, price) in enumerate(zip(titles, prices)):
        data = {
            'index': index,
            'title': title.get_text(),
            'price': float(price.get_text())
        }
        sheet_lines.insert_one(data)

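def find_price():
    # $gt matches prices strictly above 500, as the task asks
    for item in sheet_lines.find({'price': {'$gt': 500}}):
        print(item['title'])

for url_single in urls:
    get_details(url_single)
    time.sleep(2)  # pause between requests

# query once, after all three pages are stored, so no title is printed twice
find_price()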
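One thing the script above does not handle: every run inserts the same listings again, so the collection accumulates duplicates. A minimal sketch of one way around this, assuming the title is a good enough key (an assumption, since two different rooms could share a title), is to upsert instead of insert. The helper name `save_listing` is hypothetical; `update_one` with `upsert=True` is a standard pymongo collection method.

```python
def save_listing(collection, listing):
    """Store one listing, overwriting any earlier copy with the same title.

    `collection` is expected to behave like a pymongo Collection, for
    example the sheet_lines object above. Keying on the title is an
    assumption made for this sketch.
    """
    collection.update_one(
        {'title': listing['title']},   # which document to look for
        {'$set': listing},             # fields to write
        upsert=True,                   # insert when nothing matches
    )
```

Inside get_details, the call would be `save_listing(sheet_lines, data)` in place of `sheet_lines.insert_one(data)`.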

3, Summary and reflections

A few points to note:


Practice makes perfect!
