Chapter 8: Advanced Scrapy

2018-01-04  Xia0JinZi

Advanced Scrapy

Tags (space-separated): python scrapy selenium


Selenium: dynamic pages and requests

from selenium import webdriver

# Start Chrome through the local chromedriver binary.
brower = webdriver.Chrome('E:/spider_tools/chromedriver_win32/chromedriver.exe')
brower.get('https://www.zhihu.com/#signin')
# Switch from QR-code login to the account/password form.
brower.find_element_by_css_selector('.qrcode-signin-step1 span.signin-switch-password').click()
# Fill in the credentials (placeholders here) and submit the form.
brower.find_element_by_css_selector('.view-signin input[name="account"]').send_keys('YOUR_ACCOUNT')
brower.find_element_by_css_selector('.view-signin input[name="password"]').send_keys('YOUR_PASSWORD')
brower.find_element_by_css_selector('.view-signin button.sign-button').click()
brower.quit()

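Note: simulated clicks only work once get() has finished loading the page. One simple way to give the page time to finish loading is to add time.sleep(15) before the first click.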
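A fixed 15-second sleep waits the full time even when the page is ready much sooner. As a rough alternative, the sketch below polls a condition until it succeeds or a timeout expires. The helper name wait_for and its parameters are illustrative, not part of Selenium; Selenium ships its own WebDriverWait class for the same purpose.

```python
import time


def wait_for(condition, timeout=15.0, poll=0.5):
    """Call condition() until it returns a truthy value or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within %s seconds' % timeout)
        time.sleep(poll)


# Usage with the brower object from the login example (needs a running browser).
# find_elements_* returns an empty list until the element exists:
# matches = wait_for(lambda: brower.find_elements_by_css_selector(
#     '.qrcode-signin-step1 span.signin-switch-password'))
# matches[0].click()
```

The wait returns as soon as the element shows up, so 15 seconds becomes an upper bound rather than a fixed cost.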

# Scroll to the bottom of the page with JavaScript and return the page height.
brower.execute_script('window.scrollTo(0,document.body.scrollHeight);var lenOfPage = document.body.scrollHeight;return lenOfPage;')
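A single scroll only reaches the bottom of what has loaded so far. For pages that keep loading content as you scroll, one possible approach is to repeat the scroll until the returned height stops growing. This is a sketch under that assumption; scroll_to_end, pause and max_rounds are hypothetical names, and the function only relies on the driver's execute_script method.

```python
import time

SCROLL_JS = ('window.scrollTo(0, document.body.scrollHeight);'
             'return document.body.scrollHeight;')


def scroll_to_end(driver, pause=1.0, max_rounds=20):
    """Scroll down repeatedly until the page height stops growing."""
    last_height = 0
    for _ in range(max_rounds):
        height = driver.execute_script(SCROLL_JS)
        if height == last_height:
            break  # nothing new was loaded, so this is the real bottom
        last_height = height
        time.sleep(pause)  # give lazily loaded content time to arrive
    return last_height


# Usage (needs a running browser):
# final_height = scroll_to_end(brower)
```

The max_rounds cap keeps the loop from running forever on feeds that never end.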
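# Chrome option that stops images from loading (2 = block); pass chrom_opt to webdriver.Chrome when creating the driver.
chrom_opt = webdriver.ChromeOptions()
prefimg = {"profile.managed_default_content_settings.images":2}
chrom_opt.add_experimental_option('prefs',prefimg)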
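The options object only has an effect once it is handed to the driver, a step the snippet above leaves out. A minimal sketch of that step follows; the helper name make_image_free_options is made up for illustration, and the commented usage assumes the Selenium 3 keyword chrome_options (newer releases name it options).

```python
def make_image_free_options(options):
    """Configure a ChromeOptions-like object so Chrome skips image downloads."""
    prefs = {"profile.managed_default_content_settings.images": 2}
    options.add_experimental_option('prefs', prefs)
    return options


# Usage (needs selenium and chromedriver installed):
# from selenium import webdriver
# chrom_opt = make_image_free_options(webdriver.ChromeOptions())
# brower = webdriver.Chrome(
#     executable_path='E:/spider_tools/chromedriver_win32/chromedriver.exe',
#     chrome_options=chrom_opt)
```

Skipping images usually makes pages load faster, which matters when a crawler opens many of them.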
# PhantomJS: a headless browser driven through the same webdriver API.
brower = webdriver.PhantomJS(executable_path='E:/spider_tools/phantomjs-2.1.1-windows/bin/phantomjs.exe')
brower.get('https://www.zhihu.com/#signin')
telnet ip:port
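Assuming this line refers to Scrapy's built-in telnet console, the host and port to connect to come from the project settings. The fragment below is a sketch of the relevant settings.py entries, using the defaults given in the Scrapy documentation rather than values from the author's project.

```python
# settings.py fragment (documented Scrapy defaults, shown for illustration)
TELNETCONSOLE_ENABLED = True       # the console is on by default
TELNETCONSOLE_HOST = '127.0.0.1'   # interface the console listens on
TELNETCONSOLE_PORT = [6023, 6073]  # port range; the first free port is used
```

With these values a running crawl can typically be reached with telnet 127.0.0.1 6023, and calling est() inside the console prints the engine status. Recent Scrapy releases also ask for a username and password before opening the console.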
scrapy crawl lagou -s JOBDIR=job_info/001  # fresh start; no spaces around "="
Ctrl+C  # press once to pause; rerun the same command to resume
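The pause and resume cycle depends on the JOBDIR value: rerunning with the same directory resumes the paused crawl, while a new directory starts from scratch. The sketch below builds the command from Python to make that explicit. The helper crawl_command is hypothetical, and job_info/002 is just an example of a second directory.

```python
import subprocess


def crawl_command(spider, job_dir):
    """Build the scrapy command line; JOBDIR=<dir> must not contain spaces around '='."""
    return ['scrapy', 'crawl', spider, '-s', 'JOBDIR=%s' % job_dir]


# Usage (run inside the Scrapy project directory):
# subprocess.run(crawl_command('lagou', 'job_info/001'))  # start, or resume after a pause
# subprocess.run(crawl_command('lagou', 'job_info/002'))  # separate, fresh crawl
```

Pressing Ctrl+C once lets Scrapy finish in-flight requests and save its state; a second Ctrl+C forces an immediate stop and may leave the saved state incomplete.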
