python selenium 爬虫

网页元素查找

2019-07-17  本文已影响0人  你好_兔先生

1.1、查找一个元素

1.2、查找多个元素


网页

<html>
 <body>
  <form id="loginForm">
   <input name="username" type="text" />
   <input name="password" type="password" />
   <input name="continue" type="submit" value="Login" />
   <input name="continue" type="button" value="Clear" />
  </form>
</body>
<html>

代码:

login_form = driver.find_element_by_id('loginForm')
username = driver.find_element_by_name('username')
password = driver.find_element_by_name('password')

私有方法:

from selenium.webdriver.common.by import By

driver.find_element(By.XPATH, '//button[text()="Some text"]')
driver.find_elements(By.XPATH, '//button')

1.3、XPATH 高级元素查找

<html>
 <body>
  <form id="loginForm">
   <input name="username" type="text" />
   <input name="password" type="password" />
   <input name="continue" type="submit" value="Login" />
   <input name="continue" type="button" value="Clear" />
  </form>
</body>
<html>
login_form = driver.find_element_by_xpath("/html/body/form[1]")
login_form = driver.find_element_by_xpath("//form[1]")
login_form = driver.find_element_by_xpath("//form[@id='loginForm']")
  1. 绝对路径查找,不推荐。网页一改就容易出错。

  2. HTML中的第一个元素

  3. id='loginForm'

username = driver.find_element_by_xpath("//form[input/@name='username']")
username = driver.find_element_by_xpath("//form[@id='loginForm']/input[1]")
username = driver.find_element_by_xpath("//input[@name='username']")
clear_button = driver.find_element_by_xpath("//input[@name='continue'][@type='button']")
clear_button = driver.find_element_by_xpath("//form[@id='loginForm']/input[4]")
了解更多

1.4、超链接文本定位

<html>
 <body>
  <p>Are you sure you want to do this?</p>
  <a href="continue.html">Continue</a>
  <a href="cancel.html">Cancel</a>
</body>
<html>
continue_link = driver.find_element_by_link_text('Continue')
continue_link = driver.find_element_by_partial_link_text('Conti')

1.5、通过标记名字(Tag Name)定位

<html>
 <body>
  <h1>Welcome</h1>
  <p>Site content goes here.</p>
</body>
<html>
heading1 = driver.find_element_by_tag_name('h1')

1.6、通过Class名字定位

<html>
 <body>
  <p class="content">Site content goes here.</p>
</body>
<html>
content = driver.find_element_by_class_name('content')

1.7、通过CCS选择器定位

<html>
 <body>
  <p class="content">Site content goes here.</p>
</body>
<html>
content = driver.find_element_by_css_selector('p.content')
上一篇 下一篇

猜你喜欢

热点阅读