Python, Selenium, And ChromeDriver: Automating Web Interactions - Part 2

Let's continue our exploration of possibilities and tricks using Selenium and ChromeDriver via Python. In the 2nd installment of this series, I will touch upon useful techniques to interact with the specific elements you need in the web page or user interface.

Find elements using XPath

In the last example of our first article, we saw how to get the "div" elements of the search results page and extract their text. You may have noticed that such a search returns a long list of elements, and you may not need all of the content items listed. One alternative is to narrow the selection with a CSS selector, via find_elements_by_css_selector(), provided you can find a CSS selector that works for your use case.

There is a third way to target elements in an HTML page more precisely, and one I use a lot: an XPath path expression, which, when evaluated, points to elements in the HTML structure. That is the technique we are going to see now.

So, let's say we want to get only the "Events" data from the website (including the Python software releases information). Instead of using the find_elements_by_tag_name() function, we would use find_elements_by_xpath(), and pass the right XPath path expression to it.

If you look at the source code, these content items are listed using the <li> tag contained in a <ul> tag with the attribute class="list-recent-events menu".

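In the XPath way of thinking, you could target these elements using the path expression //ul[@class='list-recent-events menu']. The double / at the beginning means the search matches ul elements at any depth in the document, not only directly under the root (the HTML structure is viewed as a tree of nodes). The part in square brackets is a predicate that keeps only the elements whose class attribute has exactly that value.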
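To try this kind of expression without launching a browser, here is a minimal sketch that evaluates it against a small, made-up HTML fragment using Python's standard xml.etree.ElementTree module. The fragment and its item texts are assumptions for illustration, not the real python.org markup. ElementTree only supports a subset of XPath: it needs a leading . before //, and the input must be well-formed XML. The /li step is appended here to select the items rather than the list itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, loosely shaped like the markup described above.
HTML_FRAGMENT = """
<div>
  <ul class="menu">
    <li>Unrelated entry</li>
  </ul>
  <ul class="list-recent-events menu">
    <li>Event A</li>
    <li>Event B</li>
  </ul>
</div>
"""

root = ET.fromstring(HTML_FRAGMENT)

# ".//" searches at any depth below the root element; the predicate in
# square brackets keeps only the <ul> whose class attribute matches exactly.
XPATH = ".//ul[@class='list-recent-events menu']/li"

event_titles = [li.text for li in root.findall(XPATH)]
print(event_titles)
```

The first list is skipped because its class attribute is only "menu": the predicate compares the whole attribute value, not individual class names.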

As a side note, you can quickly grasp XPath path expressions, at least for common cases, using these tips and resources:

  • When analyzing the HTML source of the page in Firebug or Chrome developer tool, you can quickly identify the XPath expression you need.
  • A free tutorial provided by Altova that you can find from this page.

Back to our code, using driver.find_elements_by_xpath("//ul[@class='list-recent-events menu']") should get us the list of the Event articles shown on the page.

The new version of the code would look like:

from contextlib import closing

from selenium import webdriver
from selenium.webdriver.common.keys import Keys 
from selenium.webdriver.chrome.options import Options

options = Options()
options.set_headless(headless=True)

with closing(webdriver.Chrome(chrome_options=options)) as driver:
    driver.get("http://www.python.org") 

    elem = driver.find_element_by_name("q")
    elem.clear()
    elem.send_keys("pycon")
    elem.send_keys(Keys.RETURN) 

    # Get Event articles
    elems = driver.find_elements_by_xpath("//ul[@class='list-recent-events menu']")
    for elem in elems:
        print(elem.text)

When the driver does not find elements in the page

Sometimes the page content takes a while to load, just as it does in a normal browser, and the driver looks for the elements before they exist in the page.

A trick you can use to improve things in this situation is to set an implicit wait on the driver. You basically tell the driver to keep looking for a certain time when it does not find the elements, before giving up. You do that by calling driver.implicitly_wait(10), for example, if you want it to wait up to 10 seconds, just after its creation. Note that the similarly named driver.set_page_load_timeout(10) does something different: it limits how long driver.get() may take to load a page, and has no effect on element lookups.

...

options = Options()
options.set_headless(headless=True)

with closing(webdriver.Chrome(chrome_options=options)) as driver:
    driver.implicitly_wait(10)

    # Do the work here...
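
To make the idea of "keep looking before giving up" concrete, here is a rough sketch of a polling loop. It is not how Selenium implements its waits; wait_until and fake_find are hypothetical names, and the fake lookup stands in for a real call such as a find_elements_by_xpath() lookup, which returns an empty list when nothing matches.

```python
import time


def wait_until(find, timeout=10.0, poll_interval=0.5):
    """Call find() repeatedly until it returns a non-empty result.

    Raises TimeoutError if nothing is found before the timeout elapses.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = find()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"nothing found within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a page that only shows its content on the third lookup.
attempts = []


def fake_find():
    attempts.append(1)
    return ["event item"] if len(attempts) >= 3 else []


found = wait_until(fake_find, timeout=2.0, poll_interval=0.01)
print(found, "after", len(attempts), "attempts")
```

For real projects, Selenium ships its own explicit wait helper, WebDriverWait in selenium.webdriver.support.ui, which is usually a better choice than a hand-rolled loop.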

Action Chains

There are cases where you cannot reach the elements you want to interact with just by looking at the page's HTML; you first need to perform some mouse action that makes the element(s) appear. That is the case with some navigation menus and a number of JavaScript-enabled interactions.

Selenium has a facility for that, called Action Chains. From the documentation...

""" ActionChains are a way to automate low level interactions such as mouse movements, mouse button actions, key press, and context menu interactions. This is useful for doing more complex actions like hover over and drag and drop. """

Let's see a small example: we could simulate the mouse hovering over the Downloads menu and then clicking on the License entry to show the License page. This is possible with the following snippet, using ActionChains:

menu_elt = driver.find_element_by_css_selector("#downloads")
menu_link_elt = driver.find_element_by_css_selector("li.element-6")

actions = ActionChains(driver)
actions.move_to_element(menu_elt)
actions.click(menu_link_elt)
actions.perform()

And the complete code example:

from contextlib import closing

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.action_chains import ActionChains

options = Options()
options.set_headless(headless=True)

with closing(webdriver.Chrome(chrome_options=options)) as driver:
    driver.get("http://www.python.org")

    menu_elt = driver.find_element_by_css_selector("#downloads")
    menu_link_elt = driver.find_element_by_css_selector("li.element-6")

    actions = ActionChains(driver)
    actions.move_to_element(menu_elt)
    actions.click(menu_link_elt)
    actions.perform()

    assert "History and License" in driver.page_source
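
One detail that is easy to miss: the calls on an ActionChains object only queue the actions, and nothing happens in the browser until perform() is called. The toy class below is a hypothetical stand-in that mimics that queue-then-perform pattern so it can be run without a browser; ToyActionQueue is not part of Selenium and the string arguments replace real elements.

```python
class ToyActionQueue:
    """Hypothetical stand-in for the queue-then-perform pattern (not Selenium code)."""

    def __init__(self):
        self._queued = []  # actions waiting for perform()
        self.log = []      # actions that have actually been "executed"

    def move_to_element(self, name):
        self._queued.append(("move_to", name))
        return self  # returning self allows calls to be chained

    def click(self, name):
        self._queued.append(("click", name))
        return self

    def perform(self):
        # Run the queued actions in the order they were added, then reset.
        self.log.extend(self._queued)
        self._queued.clear()


actions = ToyActionQueue()
actions.move_to_element("downloads menu").click("license link")
log_before = list(actions.log)  # nothing has run yet

actions.perform()
log_after = list(actions.log)   # both actions ran, in order
print(log_before, log_after)
```

The real ActionChains methods also return the object itself, so the example above could equally be written as a single chained expression ending in .perform().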

Wrapping up

That's it for today. By now, you have enough information to start playing with these techniques and even try them in your pet projects. And if you are a lean entrepreneur who needs someone else to work on the task, this should help you brief them.
