Getting a web page's source code with different Python networking modules

2016-08-07, C_Y_

The requests module

import requests

req = requests.get(url)
source = req.text  # response body decoded to str

Alternatively, use the content attribute, which returns the raw bytes instead of decoded text:

source = req.content
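The difference between req.text and req.content matters most when the encoding is guessed wrongly, which can happen with Chinese pages whose headers do not declare a charset. The following is a minimal sketch, not part of the original post: the helper names decode_body and get_source and the default of utf-8 are assumptions made for illustration. It fetches the raw bytes and decodes them with an encoding you choose yourself.

```python
def decode_body(content: bytes, encoding: str = "utf-8") -> str:
    """Decode raw response bytes (like req.content) into text (like req.text)."""
    # errors="replace" substitutes undecodable bytes instead of raising
    return content.decode(encoding, errors="replace")


def get_source(url: str, encoding: str = "utf-8", timeout: float = 10.0) -> str:
    """Fetch url with requests and decode the body with an explicit encoding."""
    # requests is third-party; it is imported here so decode_body works without it
    import requests

    req = requests.get(url, timeout=timeout)
    req.raise_for_status()  # raise an exception on 4xx/5xx responses
    return decode_body(req.content, encoding)
```

For a page served as GBK, for example, get_source(url, encoding="gbk") would return readable text even if req.text came out garbled.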

The selenium module

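from selenium import webdriver

# dcap: desired-capabilities dict defined elsewhere. PhantomJS needs Selenium 3 or earlier.
driver = webdriver.PhantomJS(desired_capabilities=dcap)
driver.get(url)
source = driver.page_source  # source of the page as loaded by the browser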

The BeautifulSoup module

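from bs4 import BeautifulSoup

soup = BeautifulSoup(req.text, 'lxml')
source = str(soup)   # the parsed document serialized back to HTML
webtext = soup.text  # the text attribute holds all the text content, with tags stripped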

The urllib module

import urllib.request  # Python 3

response = urllib.request.build_opener().open(url)
source = response.read()  # raw bytes; decode to get str
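Since response.read() returns bytes, a Python 3 version usually needs a decoding step. The following is a minimal sketch using urllib.request.urlopen rather than an opener object. The function name fetch_source and the fallback to utf-8 when the server declares no charset are assumptions, not part of the original post.

```python
import urllib.request


def fetch_source(url: str, timeout: float = 10.0) -> str:
    """Return the page source at url as a str."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()  # raw bytes of the response body
        # Use the charset from the Content-Type header when there is one
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")
```

Calling fetch_source(url) then gives a string comparable to req.text in the requests example.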
