Testing

Types of Tests

It’s important to know that there are many different types of tests you can write. The most common types are integration tests, unit tests, and behavioural tests (which include accessibility tests). A test can be purely one type or a combination of these. While most of these tests can be automated, some involve a manual step, particularly behavioural testing.

Integration tests run a piece of code together with the code it uses. For example, an integration test might run a function you’ve written that creates a post in WordPress. The test would actually create the post, which checks that your code works correctly with the code you’re integrating with.

Unit tests test a single layer of your code. The dependencies of that code are mocked out (replaced by fake versions), which ensures you’re only testing the code in, for example, a single function. Unit tests are typically as small as possible, so that each one tests a single thing.
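To make the contrast between integration and unit tests concrete, here is a minimal sketch. It is written in Python only so that it stays self-contained and runnable without a WordPress install; in a PHP project the same shape would normally be written with PHPUnit. `create_post`, `InMemoryStore` and its `insert` method are hypothetical stand-ins invented for this illustration, not WordPress APIs.

```python
from unittest.mock import Mock


class InMemoryStore:
    """Hypothetical stand-in for the system being integrated with (e.g. a database)."""

    def __init__(self):
        self.rows = {}

    def insert(self, row):
        new_id = len(self.rows) + 1
        self.rows[new_id] = row
        return new_id


def create_post(store, title):
    """Hypothetical code under test: normalises the title, then saves the post."""
    title = title.strip()
    if not title:
        raise ValueError("title must not be empty")
    return store.insert({"title": title, "status": "draft"})


def test_integration_style():
    # Runs create_post together with a working store, so a bug in either
    # the function or the store it relies on would fail the test.
    store = InMemoryStore()
    post_id = create_post(store, "  Hello  ")
    assert store.rows[post_id] == {"title": "Hello", "status": "draft"}


def test_unit_style():
    # The store is mocked out, so only the logic inside create_post is checked.
    store = Mock()
    store.insert.return_value = 42
    assert create_post(store, "  Hello  ") == 42
    store.insert.assert_called_once_with({"title": "Hello", "status": "draft"})


test_integration_style()
test_unit_style()
```

The unit-style test would still pass if the real store were broken, which is exactly the trade-off: it isolates the function's own logic but says nothing about whether the pieces work together.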

Behavioural tests run user actions against your code. A test may be a series of steps, after which you check that the result matches what you expect.

Integration and unit tests in PHP are typically written using PHPUnit, while behavioural tests are written with Behat. (Behavioural tests for sites can also use external tools, like Pingdom’s Transaction Monitoring, although these trade control for a better UI.)

Accessibility testing consists of frontend (DOM) validation, keyboard testing and screen reader testing. DOM validation can be partially automated, while keyboard and screen reader testing still need to be done manually.
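As an illustration of the part of DOM validation that can be automated, the sketch below flags images that have no `alt` attribute. It uses only Python's standard library so it can run anywhere; `MissingAltChecker` and `images_missing_alt` are hypothetical names made up for this example, and a real project would more likely rely on a dedicated accessibility checker that covers many more rules.

```python
from html.parser import HTMLParser


class MissingAltChecker(HTMLParser):
    """Collects <img> tags that have no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs. An empty alt="" is accepted
        # here because it is the usual way to mark a decorative image.
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.missing.append(attributes.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src of every <img> in the markup that lacks an alt attribute."""
    checker = MissingAltChecker()
    checker.feed(html)
    checker.close()
    return checker.missing


print(images_missing_alt('<img src="a.png"><img src="b.png" alt="">'))
```

A check like this can only confirm that an `alt` attribute is present. Whether the text is actually useful to someone using a screen reader still has to be judged by a person, which is why the manual steps remain.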

There are many other types of tests, including visual regression testing, as well as tonnes of alternative tools, but we don’t tend to use those. Feel free to explore the other options available to see if anything better suits your project.

What tests should I write?

It’s entirely subjective as to what sort of tests you should write, and how many tests you should write. This section is my opinion. – RM

Depending on what sort of project you want to test, the mix of tests may be different:

  • For pure libraries, unit tests are typically the best way to ensure that your library behaves as expected. Libraries are expected to be self-contained units, although integration tests should be used to ensure your library interfaces with other libraries or external services correctly.

  • For plugins, integration tests are mostly what you want to write. The entire purpose of most plugins is to interface with WordPress. Although writing unit tests may help here too, the effort required to mock out parts of WordPress is mostly not worth it, and you’ll likely need lots of integration tests anyway. A lot of the benefit of testing here comes from integration tests.

  • For themes or entire sites, behavioural tests will give the best value. These test that your site/theme behaves the way you expect, with links, elements, etc. existing where they should on the rendered site. Unit or integration tests are not usually as important here.

Each project will have its own requirements for the number and type of tests you need. Large site builds may have unit and integration tests for underlying plugins and libraries, and behavioural tests for the theme as a whole.

Typically, libraries and plugins should aim for 100% test coverage, whether through unit or integration tests. The specific nature of these tests depends on the plugin at hand. On the other hand, themes and sites warrant much less testing, as they’re immediately user-facing and are less likely to break in subtle ways.

It’s important to always be mindful of how much time and effort testing takes, and to be pragmatic. It’d be great to have the entire codebase of every site 100% unit tested, but that’s often not realistic. Instead, test the underlying custom behaviour (plugins), and use behavioural tests to test edge-case functionality. Redistributable projects (e.g. open source plugins and libraries) should generally be held to a higher standard, as they’re much more likely to hit edge cases.