Java Web Crawler

2022-02-21  请叫我平爷
  1. Add the jsoup dependency
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.10.2</version>
</dependency>
  2. Fetch the content
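// Imports this snippet needs (assuming JSON is fastjson's and @Test is JUnit 4's)
import com.alibaba.fastjson.JSON;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.junit.Test;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// This method belongs inside a test class
@Test
public void textDemo() throws Exception {
    String url = "https://search.jd.com/Search?keyword=java&enc=utf-8";
    // Fetch and parse the page (30-second timeout)
    Document document = Jsoup.parse(new URL(url), 30000);
    Element element = document.getElementById("J_goodsList");
    if (element == null) {
        // The product list is missing, e.g. the site returned a different page
        throw new IllegalStateException("Element #J_goodsList not found");
    }
    // Get all li elements
    Elements elements = element.getElementsByTag("li");
    List<ProductBean> productBeanList = new ArrayList<>();
    // Extract the content; each el here is one li tag
    for (Element el : elements) {
        String img = el.getElementsByTag("img").eq(0).attr("src");
        String price = el.getElementsByClass("p-price").eq(0).text();
        String name = el.getElementsByClass("p-name").eq(0).text();
        ProductBean bean = new ProductBean();
        bean.setImg(img);
        bean.setName(name);
        bean.setPrice(price);
        productBeanList.add(bean);
    }
    System.out.println(JSON.toJSON(productBeanList));
}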
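
The test method uses a ProductBean class that this post never shows. Here is a minimal sketch of what it might look like, assuming it is a plain bean with three String fields matching the setters called in the test. The getters and toString are my own additions, included so that a JSON serializer or a debugger has something to read; the real class may differ.

```java
// Hypothetical ProductBean: only setImg, setName and setPrice appear in the post.
// In a real project this would normally be a public class in its own file.
class ProductBean {
    private String img;   // value of the first img tag's src attribute
    private String name;  // text of the element with class p-name
    private String price; // text of the element with class p-price, kept as a String

    public String getImg() {
        return img;
    }

    public void setImg(String img) {
        this.img = img;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getPrice() {
        return price;
    }

    public void setPrice(String price) {
        this.price = price;
    }

    @Override
    public String toString() {
        return "ProductBean{img='" + img + "', name='" + name + "', price='" + price + "'}";
    }
}
```

The price stays a String because the scraped text may include a currency symbol; converting it to a number would need extra cleanup first.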