Using the BeautifulSoup Library

2017-09-20 柠檬丸子

BeautifulSoup is a library for parsing, traversing, and maintaining a "tag tree".
Here it is used to parse HTML.
from bs4 import BeautifulSoup
import requests
r = requests.get("http://www.baidu.com")
demo = r.text  # the HTML source of the page, as a string
soup = BeautifulSoup(demo, 'html.parser')  # parse the HTML into a tag tree

Another way is to parse a local file: soup = BeautifulSoup(open('D://demo.html'), 'html.parser')
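When parsing a local file, it can help to open it with an explicit encoding and let a `with` block close it afterwards. The following is only a sketch: the HTML content, the temporary directory, and the name `file_soup` are placeholders made up for illustration, not part of the original example.

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Hypothetical sample document, written to a temporary file for illustration.
html = "<html><head><title>demo</title></head><body><p>hello</p></body></html>"
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo.html")
with open(path, "w", encoding="utf-8") as f:
    f.write(html)

# Pass the open file object to BeautifulSoup; the with block closes it afterwards.
with open(path, encoding="utf-8") as f:
    file_soup = BeautifulSoup(f, "html.parser")

print(file_soup.title.string)  # demo
print(file_soup.p.string)      # hello
```

BeautifulSoup reads the whole file while constructing the tree, so the parsed object stays usable after the file is closed.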

tag = soup.a  # find the first a tag
print(tag.attrs['class'])  # look at the class attribute of the a tag
print(tag.string)  # the text inside the a tag
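The attribute access above can be tried without a network request. This is a minimal sketch on a made-up HTML snippet (the URL, class name, and id are assumptions for illustration). It also shows `tag.get`, which returns None for a missing attribute, whereas indexing `tag.attrs` raises a KeyError.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet; not the page used in the article.
html = '<p>See <a href="http://example.com/a" class="py1" id="link1">Basic Python</a></p>'
soup = BeautifulSoup(html, "html.parser")

tag = soup.a                 # first <a> tag in the document
print(tag.name)              # a
print(tag.attrs["class"])    # ['py1'] (class is multi-valued, so it comes back as a list)
print(tag.get("href"))       # http://example.com/a
print(tag.get("title"))      # None, because the attribute is absent
print(tag.string)            # Basic Python
```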

Use this URL for testing: https://python123.io/ws/demo.html

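soup.title  # the title tag of the page

soup.a.parent.name  # name of the tag that contains the first a tag

soup.p.parent.name  # name of the tag that contains the first p tag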
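To see what these expressions return, here is a small sketch on an inline document. The HTML is made up for illustration and is not the content of the test URL. It also uses `.parents`, which walks upward from a tag to the root of the tree.

```python
from bs4 import BeautifulSoup

# Hypothetical document used only to demonstrate upward navigation.
html = (
    "<html><head><title>Demo page</title></head>"
    '<body><p class="course">Intro <a href="http://example.com/a">Basic Python</a></p></body></html>'
)
soup = BeautifulSoup(html, "html.parser")

print(soup.title)            # <title>Demo page</title>
print(soup.a.parent.name)    # p
print(soup.p.parent.name)    # body

# Walk upward from the first <a> tag to the root of the tree.
ancestors = [parent.name for parent in soup.a.parents]
print(ancestors)             # ['p', 'body', 'html', '[document]']
```

The last ancestor is the BeautifulSoup object itself, whose name is the special value `[document]`.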
