Intro to Browser Testing

An in-depth overview of how to get started with browser testing via Webdriver.

When I started rewriting the build process for my website, one of my goals was to have a comprehensive set of integration tests and browser tests, meaning tests that would literally run in a real web browser.

As with testing in general, the theory behind controlling a browser is often presented as something arcane or difficult to learn. But it's a lot more straightforward than you might expect.

TLDR summary

There are three steps to controlling a browser through code:

  • launching a browser (preferably one that’s headless)
  • communicating with that browser over the Webdriver Protocol
  • navigating that browser to your own deployment

Many of the popular frameworks for browser testing are just wrappers around these three steps - so there's a lot of potential benefit to (occasionally) just rolling this code yourself.

What's a headless browser and where do I get one?

Most modern browsers either ship with a driver program or accept command line options that let them be controlled through an API. Nothing inherently special happens when a testing framework launches a browser. It's just finding the executable on your computer and then starting it with the proper command line arguments.

With browser testing, it really is the browser doing most of the heavy lifting. Tools like Selenium aren't going in and editing your computer memory - they're just interfaces between you and the browser that you run.

A headless browser is a browser that launches without a graphical UI. You don't need one for any of the code I'm about to show you; you can connect to and control most browsers: Chrome, Firefox, even IE. But headless browsers are common in browser tests because they need fewer dependencies and often use fewer system resources.

Plus they just feel cleaner, you know? They make it so you don't have an extra window popping up on your screen every time you run a test.

PhantomJS has gone down in popularity since Headless Chrome was announced, but it's still a lighter, easier browser to just include inside a Node project and then launch - especially since the Headless Chrome team hasn't technically gotten rid of all of their graphical dependencies yet. So we'll be using PhantomJS.

var phantomjs = require('phantomjs-prebuilt');
var browser = phantomjs.run('--webdriver=4444');

The above code launches a browser that I can connect to over port 4444. More on that in a bit.

Keep in mind that even in the example above, all that phantomjs-prebuilt is doing is spawning a child process from Node that runs the phantomjs executable with a command line option. Nothing magical at all.
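
To make that concrete, here is a minimal sketch of launching the browser by hand with Node's built-in child_process module. The buildArgs and launchBrowser names are mine, and the executable path is an assumption: pass in wherever the phantomjs binary lives on your machine (phantomjs-prebuilt exposes its own copy as phantomjs.path).

```javascript
var childProcess = require('child_process');

// Build the command line arguments for a given webdriver port.
function buildArgs (port) {
    return ['--webdriver=' + port];
}

// Spawn the browser executable directly and return the child process.
// `executablePath` is whatever binary you want to run.
function launchBrowser (executablePath, port) {
    var child = childProcess.spawn(executablePath, buildArgs(port), { stdio : 'inherit' });
    child.on('error', function (err) {
        console.error('Could not start browser:', err.message);
    });
    return child;
}

// Usage (not run here; the path is hypothetical):
// var browser = launchBrowser('/usr/local/bin/phantomjs', 4444);
// ...and later: browser.kill();
```

Unlike phantomjs.run, this sketch doesn't wait for the browser to start listening, so you'd need to give it a moment before connecting.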

So now I need a library to control the browser?

Yes and no.

I mentioned above that it's your browser allowing you to control it, not JavaScript. When we control the browser, we'll do so using the Webdriver Protocol.

Webdriver doesn't care about what language you're using. You could write your tests in Python, Java, Lisp, or even Bash. The browser is just listening for network requests and responding to them.
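
To illustrate how little is language-specific here, the sketch below describes a few commands as plain HTTP requests and sends them with nothing but Node's http module. The endpoint paths (POST /session, POST /session/:id/url, GET /session/:id/title, DELETE /session/:id) come from the WebDriver wire protocol; the buildRequest and send helpers are hypothetical names of mine, and some drivers expect a base path in front of these routes, so treat the details as assumptions.

```javascript
var http = require('http');

// Describe a webdriver command as a plain HTTP request.
function buildRequest (command, sessionId, value) {
    switch (command) {
    case 'newSession':
        return { method : 'POST', path : '/session', body : { desiredCapabilities : value || {} } };
    case 'url':
        return { method : 'POST', path : '/session/' + sessionId + '/url', body : { url : value } };
    case 'title':
        return { method : 'GET', path : '/session/' + sessionId + '/title', body : null };
    case 'end':
        return { method : 'DELETE', path : '/session/' + sessionId, body : null };
    default:
        throw new Error('Unknown command: ' + command);
    }
}

// Send a request description to the browser and resolve with the parsed JSON reply.
function send (host, port, req) {
    return new Promise(function (resolve, reject) {
        var payload = req.body ? JSON.stringify(req.body) : '';
        var request = http.request({
            host : host,
            port : port,
            method : req.method,
            path : req.path,
            headers : {
                'Content-Type' : 'application/json',
                'Content-Length' : Buffer.byteLength(payload)
            }
        }, function (response) {
            var data = '';
            response.on('data', function (chunk) { data += chunk; });
            response.on('end', function () {
                try { resolve(JSON.parse(data)); } catch (err) { reject(err); }
            });
        });
        request.on('error', reject);
        request.end(payload);
    });
}

// Usage (needs a browser listening on port 4444, so it isn't run here):
// send('localhost', 4444, buildRequest('newSession', null, { browserName : 'phantomjs' }));
```

Any language that can make an HTTP request and parse JSON could do the same thing.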

It's not a bad idea to learn how the webdriver protocol works, and I may do a lower-level tutorial in the future. But for our purposes it's overkill. In day-to-day testing you probably aren't interested in handling network requests yourself. Additionally, remember that we're talking about web browsers here, so support for the webdriver protocol is… not perfect.

We'll be relying on webdriverio to help us send those network requests. Think of it like jQuery; it's just an extra layer we can use to help standardize browser behavior until browser makers sort out all their crap.

Wait, so could these be separate processes?

I'm glad you asked, you're very clever!

You could send your webdriver requests from a completely separate process. Remember, we're just talking about REST-style HTTP requests here. You're never going to be directly interacting with the browser via JavaScript. To drive that point home, the initialization for webdriverio will only ask us for a host and port.

var webdriverio = require('webdriverio');
var client = webdriverio.remote({
    desiredCapabilities : { browserName : 'phantomjs' },
    host : 'localhost',
    port : '4444'
});

//And now we can start sending browser commands.
var title = client.init().url('https://google.com').getTitle();

//Because we're sending requests over a network, our code becomes asynchronous.
title.then(function (title) {
    console.log(title);
});

The obvious followup question is "could my browser be running on a separate computer?" For example, Safari on a special Macintosh VM. Unfortunately, you may run into difficulties. Many browsers by default disable remote connections over the webdriver protocol. Sometimes you'll be able to override this, but often it's better to just set up a proxy server on your remote host and send requests through it instead of directly to the browser.

That's a topic for another tutorial, so in the meantime you'll just need to take my word for it that proxying requests is probably well within your capabilities.
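
As a rough idea of what that could look like, here is a minimal sketch of a TCP proxy built on Node's net module. It would run on the remote host, accept outside connections, and forward them to a browser that only listens locally. The createProxy name and the port numbers are my own assumptions, and a driver that validates request headers may need an HTTP-aware proxy instead.

```javascript
var net = require('net');

// Create a server that pipes every incoming connection to the target host and port.
function createProxy (targetPort, targetHost) {
    return net.createServer(function (clientSocket) {
        var upstream = net.connect(targetPort, targetHost);

        clientSocket.pipe(upstream);
        upstream.pipe(clientSocket);

        // If either side fails, tear down the other side too.
        clientSocket.on('error', function () { upstream.destroy(); });
        upstream.on('error', function () { clientSocket.destroy(); });
    });
}

// Usage on the remote machine (hypothetical ports, not run here):
// createProxy(4444, '127.0.0.1').listen(4445, '0.0.0.0');
```

Your webdriver client would then connect to port 4445 on the remote host instead of talking to the browser directly.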

Okay, I can control a browser. Where's my site?

You have a browser and you're controlling it — that doesn't mean your site will be magically deployed. You'll have to handle that step yourself.

Just like up above, this could be happening in a separate process or on a remote computer. It's also possible, if you're doing something like web scraping, that you don't even have your own site to deploy, in which case you could skip this step.

In practice, if you're doing testing, you will likely want to be able to spin up a deployment of your site, test it, and then close that deployment in a single command. In the final example below, that's what I'll be doing.

But, if you're just fooling around or don't mind an extra step, feel free to just open up another terminal window and run a small static server there, like this one built on Node's http module and node-static.

var http = require('http');
var Static = require('node-static');

//Serve files from ./public with caching disabled.
var staticServer = new Static.Server('./public', { cache : 0 });
var server = http.createServer(function (request, response) {
    //Wait for the request to finish arriving, then serve the matching file.
    request.addListener('end', function () {
        staticServer.serve(request, response, function (err, result) {
            if (err) {
                response.writeHead(err.status, err.headers);
                response.end();
            }
        });
    }).resume();
}).listen(8000);

Putting it all together

Okay, we talked about the theory. Working implementation time!

var phantomjs = require('phantomjs-prebuilt');
var webdriverio = require('webdriverio');

var http = require('http');
var Static = require('node-static');

/* Create our static server and tell it to listen (Step 3) */
var staticServer = new Static.Server('./public', { cache : 0 });
var server = http.createServer(function (request, response) {
    request.addListener('end', function () {
        staticServer.serve(request, response, function (err, result) {
            if (err) {
                response.writeHead(err.status, err.headers);
                response.end();
            }
        });
    }).resume();
}).listen(8000, 'localhost', async function () {

    /* Start a browser process (Step 1) - remember, this could be any browser, headless or not. */
    var browser = await phantomjs.run('--webdriver=4444');

    /* (Step 2) Our client is a thin wrapper around the webdriver protocols. Think of it like jQuery for webdriver requests */
    var client = webdriverio.remote({
        desiredCapabilities : { browserName : 'phantomjs' },
        host : 'localhost',
        port : '4444'
    });

    await client.init(); //library specific API: opens the session before any commands are sent.

    test(client).then(async function () {
        console.log('test passed');

    }, function (error) { /* test failed */
        console.log(error);
        process.exitCode = 1;

    }).then(async function () { /* Clean up, regardless of what happens (Step 4) */
        await client.end();
        await browser.kill();
        await server.close();
    });
});

async function test (client) {
    //Whatever testing library you're using.
    await client.url('http://localhost:8000');
    var title = await client.getTitle();

    if (title !== 'My Awesome Title') { throw new Error('ERROR!'); }
}

You'll notice that there's one final step I didn't talk about above, and that's shutting everything down once your tests finish. You can always kill your processes yourself while you experiment, but for a final implementation you'll want to make sure your code actually completes at some point.

If you don't do this, your browser/site/client will stay connected after your tests finish. Occasionally, you may actually want that to happen if you're doing debugging.
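
One way to make sure cleanup always happens is to wrap the test run in try/finally. The sketch below is only an illustration: runWithCleanup is a hypothetical helper of mine, not part of webdriverio or any other library, and it assumes each cleanup step is a function that may return a promise.

```javascript
// Run `testFn`, then run every cleanup step whether or not the test threw.
// Resolves with true if the test passed and false if it failed.
async function runWithCleanup (testFn, cleanupSteps) {
    try {
        await testFn();
        return true;
    } catch (error) {
        console.log(error);
        return false;
    } finally {
        for (var i = 0; i < cleanupSteps.length; i++) {
            // A failing cleanup step shouldn't stop the remaining ones from running.
            try {
                await cleanupSteps[i]();
            } catch (err) {
                console.log('cleanup failed:', err.message);
            }
        }
    }
}

// Usage with the objects from the example above (not run here):
// runWithCleanup(function () { return test(client); }, [
//     function () { return client.end(); },
//     function () { return browser.kill(); },
//     function () { return server.close(); }
// ]).then(function (passed) { if (!passed) { process.exitCode = 1; } });
```

While debugging, you could pass an empty list of cleanup steps to leave everything running.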

Going further

There are a few topics that are outside the scope of this tutorial but that you should be a little more equipped to tackle now. If you're up for a challenge, try answering the following questions:

  • What would happen if I connected multiple webdriver clients at the same time?
  • What would be the primary difficulty I would face running my tests in parallel?
  • What would I need to do if I wanted to run multiple tests on a static site in parallel?
  • What would I need to do if I wanted to run multiple parallel tests on a dynamic site (for example one with user accounts or persistent state)?
  • What would I need to do if I wanted to use one command to run my tests on multiple browsers?
  • What if those browsers were on separate computers (e.g. a Mac and a PC)?

If you read this far, you're probably at least somewhat interested in low-level testing. If so, why not check out Distilled? It's a low-level, unopinionated testing framework that works great if you want to build your own test runners, syntax, or just cool stuff in general.