Getting a web page's source code with different Python networking modules
2016-08-07
C_Y_
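The requests module
import requests
req = requests.get(url)
source = req.text  # response body decoded to str
Alternatively, use
req.content  # raw response body as bytes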
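The difference between text and content can be made concrete with a small helper. This is only a sketch: the name fetch_source, the get parameter, and the 10-second timeout are assumptions for illustration and are not part of the original post. The get parameter exists only so the function can be exercised without network access.

```python
def fetch_source(url, get=None, timeout=10):
    """Return (text, raw_bytes) for the page at url.

    `get` is a hypothetical injection point; by default requests.get is used.
    """
    if get is None:
        import requests  # third-party package: pip install requests
        get = requests.get
    resp = get(url, timeout=timeout)
    resp.raise_for_status()  # raise on 4xx/5xx instead of returning an error page
    # resp.text is the body decoded to str, resp.content is the undecoded bytes
    return resp.text, resp.content


# Usage (needs network access):
# text, raw = fetch_source("https://example.com")
```

If the decoded text looks garbled, the raw bytes in content can be decoded by hand with the correct encoding.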
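The selenium module
from selenium import webdriver
# dcap is a DesiredCapabilities dict that must be prepared beforehand
# PhantomJS is only available in older Selenium releases
driver = webdriver.PhantomJS(desired_capabilities=dcap)
driver.get(url)
source = driver.page_source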
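Newer Selenium releases no longer ship the PhantomJS driver, so the same idea is sketched here with headless Chrome instead. This swaps in a different browser than the one used above. The function name get_rendered_source and the optional driver parameter are assumptions for illustration, and a local Chrome installation is assumed when no driver is passed in.

```python
def get_rendered_source(url, driver=None):
    """Return the page source after the browser has loaded and rendered url."""
    own_driver = driver is None
    if own_driver:
        from selenium import webdriver  # third-party package: pip install selenium
        options = webdriver.ChromeOptions()
        options.add_argument("--headless")  # run without opening a window
        driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Only close a browser that this function started itself
        if own_driver:
            driver.quit()
```

Unlike requests, page_source reflects the DOM after JavaScript has run, which is the usual reason for reaching for a browser driver.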
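The BeautifulSoup module
from bs4 import BeautifulSoup
soup = BeautifulSoup(req.text, 'lxml')
source = str(soup)  # the parsed document serialized back to markup
webtext = soup.text
# The text attribute of a BeautifulSoup object holds all of the text content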
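A small offline example may make the markup versus text distinction clearer. It is a sketch that parses a hard-coded HTML string (an assumption, so that no request is needed) and uses the standard library's html.parser backend instead of lxml, so that lxml does not have to be installed.

```python
from bs4 import BeautifulSoup  # third-party package: pip install beautifulsoup4

html = "<html><body><h1>Title</h1><p>Hello, <b>world</b>!</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

markup = str(soup)      # the document serialized back to an HTML string
text = soup.get_text()  # same result as soup.text: every text node, tags stripped

print(markup)
print(text)
```

get_text() also accepts a separator argument, for example soup.get_text(" "), which helps when adjacent elements would otherwise run together.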
The urllib module
import urllib.request  # Python 3
response = urllib.request.urlopen(url)
source = response.read()  # bytes; call .decode() to get str
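Because read() returns bytes, a decoding step is usually needed before the source can be treated as text. The following is a minimal sketch for Python 3. The helper name read_source, the 10-second timeout, and the UTF-8 fallback are assumptions for illustration.

```python
import urllib.request


def read_source(url, timeout=10):
    """Fetch url with the standard library and return the body as str."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()  # bytes
        # Use the charset from the Content-Type header when the server declares one
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")


# Usage (needs network access):
# source = read_source("https://example.com")
```

urlopen also understands data: URLs, which makes the helper easy to try without a network connection.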