Selenium is an automated testing tool for performing quality assurance on websites. Basically, it launches a browser (like Chrome or Firefox or Safari) and takes control of it, robotically visiting pages and clicking on things without human intervention. Let’s explore how it works and why it’s useful.

Installing Selenium

Note that Selenium comes in two flavours: advanced and basic. We’re only interested in the advanced one, called Selenium WebDriver (not to be confused with the less powerful Selenium IDE).

To use Selenium WebDriver, you just need to write a few lines of code in one of various supported languages. We’ll use Ruby but instructions are available for other languages as well. If you don’t yet have Ruby installed, consider using my rbenv installation method. Once Ruby is installed, you can just run:

gem install selenium-webdriver
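
To confirm the install worked before going further, you can ask RubyGems whether it can see the gem. This is only a minimal sketch: `gem_installed?` is a hypothetical helper name, not something Selenium or RubyGems provides.

```ruby
require 'rubygems'

# Hypothetical helper: true if RubyGems can find an installed gem by name.
def gem_installed?(name)
  Gem::Specification.find_by_name(name)
  true
rescue Gem::LoadError
  # Raised (or a subclass of it is) when no matching gem is installed.
  false
end

if gem_installed?('selenium-webdriver')
  puts "selenium-webdriver is installed"
else
  puts "selenium-webdriver is missing; run: gem install selenium-webdriver"
end
```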

Becoming a browser puppeteer

A fun way to experiment with Selenium is to open an interactive Ruby shell. From a terminal, run:

irb

At the prompt, enter:

require 'selenium-webdriver'

This loads the gem. You should get back a response like this:

=> true

Next, launch Chrome:

driver = Selenium::WebDriver.for :chrome

You’ll see a result like:

=> #<Selenium::WebDriver::Driver:0x..fdb8814522c6b86b4 browser=:chrome>

Meanwhile, an actual browser window should appear, as pictured below. It often launches behind other windows, so you may have to hunt for it with Command+Tab (or Alt+Tab).

Blank Selenium-created browser

Now, for the magic:

driver.get ""

And it’s alive!

Launch DuckDuckGo

Try entering a search term:

element = driver.find_element(:id, "search_form_input_homepage")
element.send_keys "selenium webdriver help"

Notice that the browser has actually filled in the search field! Now send the enter key:

element.send_keys :return

And you see the results!

DuckDuckGo search results
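
The steps so far can be collected into one small method, which is handy once you move from the interactive shell to a script. This is a sketch under assumptions: `run_search` is a hypothetical name, and the page URL and field id are passed in as parameters rather than hard-coded.

```ruby
# Hypothetical helper (not part of Selenium): replays the steps above.
# `driver` is anything that responds to #get and #find_element, such as
# the object returned by Selenium::WebDriver.for :chrome.
def run_search(driver, url, field_id, query)
  driver.get url                                # load the page
  element = driver.find_element(:id, field_id)  # locate the search box
  element.send_keys query                       # type the search term
  element.send_keys :return                     # press Enter
  element
end

# Usage with a real browser (requires the selenium-webdriver gem):
#   require 'selenium-webdriver'
#   driver = Selenium::WebDriver.for :chrome
#   run_search(driver, page_url, "search_form_input_homepage", "selenium webdriver help")
```

Passing the driver in as an argument also means the method can be exercised with a stand-in object, without launching a browser.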

Explore the Ruby bindings reference to see what else you can do. When you’re done, close the browser:

driver.quit

A practical example

Suppose you want to capture a screenshot of a given web page. The following script (called take-screenshot) accepts a URL as a parameter and uses Selenium to generate a screenshot image at a consistent size. For example:

./take-screenshot "" screenshot.png

By default, the script sets the browser size to 1024x768, but this can be adjusted by specifying additional parameters. For example, a mobile view can be obtained like this:

./take-screenshot "" screenshot.png 320 480

The result is a 320x480 sized view:

Mobile DuckDuckGo

Finally, the script accepts an optional fifth parameter: the number of seconds to wait before taking the screenshot, which gives the page time to render fully. See the full script:

#!/usr/bin/env ruby
require 'selenium-webdriver'

# Check that parameters are given
if ARGV.size < 2
  puts "Usage: #{File.basename($0)} <url> /path/to/filename.png [width] [height] [sleep]"
  exit 1
end

# Extract parameters (command-line arguments arrive as strings)
url = ARGV[0]
filename = ARGV[1]
width = (ARGV[2] || 1024).to_i
height = (ARGV[3] || 768).to_i
seconds = ARGV[4].to_i

# File should not exist
if File.exist? filename
  raise "Destination file '#{filename}' already exists!"
end

# Generate screenshot, closing the browser even if something fails
driver = Selenium::WebDriver.for :chrome
begin
  driver.manage.window.resize_to width, height
  driver.get url
  sleep seconds
  driver.save_screenshot filename
ensure
  driver.quit
end

# File should now exist
if File.exist? filename
  puts "=> Screenshot saved: '#{filename}'"
else
  raise "Screenshot was not saved to '#{filename}'!"
end

Note that the current version of this script (and others) can be found in this project’s GitHub repository.
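
A fixed sleep either waits longer than necessary or not long enough. A common alternative is to poll for a condition until it holds or a timeout expires. The sketch below is plain Ruby and only illustrates the idea: `wait_until` is a hypothetical helper, not part of the script above. Selenium's Ruby bindings also ship a `Selenium::WebDriver::Wait` class built on the same principle.

```ruby
# Hypothetical helper: calls the block repeatedly until it returns a truthy
# value (which is then returned) or `timeout` seconds elapse (then raises).
def wait_until(timeout: 5, interval: 0.2)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "Condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end

# Possible use in place of `sleep seconds` (assumes a `driver` and an
# element id of your choosing, shown here as a placeholder):
#   wait_until(timeout: 10) { driver.find_elements(:id, "some-element-id").any? }
```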

Next steps

This illustrates how you can become a browser puppeteer with little effort. Now identify some tedious web-based tasks and automate them! For quality assurance testing, you’ll want to organize your tests into units, as described in this post about Selenium, Ruby, Test::Unit and Rake. For general automation, just experiment and iterate.