MOOC: Python Web Crawler and Information Extraction, Course Notes (3)
2019-07-29
42c64edf12e9
from bs4 import BeautifulSoup  # note the capitalization: B and S are uppercase
soup = BeautifulSoup('<p>data</p>', 'html.parser')  # '<p>data</p>' is the HTML to parse; 'html.parser' selects the HTML parser
Parsers available to the Beautiful Soup library:
![](https://img.haomeiwen.com/i16636256/3efd457044019a6b.png)
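Choosing a parser comes down to the second argument of `BeautifulSoup`. The sketch below is only an illustration: `html.parser` ships with Python, while `lxml` and `html5lib` are assumed here to be optional third-party installs, so the hypothetical helper `parse_with` returns `None` when one of them is missing.

```python
from bs4 import BeautifulSoup, FeatureNotFound

html = "<p>data</p>"

def parse_with(parser_name):
    """Return the text of the <p> tag, or None if the parser is not installed."""
    try:
        soup = BeautifulSoup(html, parser_name)
    except FeatureNotFound:
        # bs4 raises FeatureNotFound when the requested parser is unavailable
        return None
    return soup.p.string

# Try the built-in parser plus two optional ones (assumed, may be absent)
results = {name: parse_with(name) for name in ("html.parser", "lxml", "html5lib")}
print(results)
```

For notes like these, `html.parser` is usually enough, since it needs no extra installation.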
Basic elements of the BeautifulSoup class:
![](https://img.haomeiwen.com/i16636256/0c90320360dd4308.png)
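A short sketch of how these elements can be inspected, assuming the basic elements are the usual five (Tag, Name, Attributes, NavigableString, Comment). The sample HTML string is made up for illustration.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment, NavigableString, Tag

# Hypothetical sample document, not taken from the course material
html = '<p class="title"><b>data</b><!-- a note --></p>'
soup = BeautifulSoup(html, "html.parser")

tag = soup.p                # Tag: the first <p> element
tag_name = tag.name         # Name: the tag's name, "p"
tag_attrs = tag.attrs       # Attributes: a dict; class is multi-valued, so it maps to a list
text = soup.b.string        # NavigableString: the text inside <b>
comment = tag.contents[1]   # Comment: the second child of <p>

print(type(tag), tag_name, tag_attrs)
print(type(text), text)
print(type(comment), comment)
```

Note that `Comment` is a subclass of `NavigableString`, so code that needs to tell real text from comments should check the type explicitly.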
Standard HTML format:
![](https://img.haomeiwen.com/i16636256/d3ee68a11f5bfe56.png)
![](https://img.haomeiwen.com/i16636256/5949fd02ca7858a8.png)
![](https://img.haomeiwen.com/i16636256/c4bb4b3faaf5ac77.png)
![](https://img.haomeiwen.com/i16636256/a74aaed374421a2a.png)
![](https://img.haomeiwen.com/i16636256/17652bbcfbe8acf6.png)
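The bs4 prettify() method: returns the HTML as a formatted string, with line breaks and indentation, so the output is easier to read.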
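A minimal sketch of `prettify()` on a made-up one-line snippet, comparing the compact form with the formatted one:

```python
from bs4 import BeautifulSoup

# Hypothetical one-line HTML, for illustration only
soup = BeautifulSoup("<p><b>data</b></p>", "html.parser")

compact = str(soup)        # the document as a single line
pretty = soup.prettify()   # the same document, one tag or text node per line, indented

print(compact)
print(pretty)
```

`prettify()` can also be called on a single tag, for example `soup.p.prettify()`, to format only that subtree.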