I’ve been writing automated tests at work for about 10 years now, and they have certainly evolved over that time. At first, I didn’t put as much thought into the code quality and expressiveness of the tests as I did into the production code. However, I soon came to realize the importance of highly readable and maintainable test code. Test code needs to be readable because the tests can often serve as living documentation for the system. It needs to be maintainable because systems change over time, and updating tests shouldn’t drag you down.
If you follow me or my blog, you shouldn’t be surprised by what I’m going to say. The tests need to follow Beck’s Four Rules of Simple Design. The tests need to 1) work, 2) have a single source of truth (DRY), 3) be expressive, and 4) be small.
And derived from Beck’s rules, here are four rules I try to follow when writing automated tests:
- Single Assert Per Test
- Scientific Control
- No Technical Details
- No Dead Code
And as an example, here is the perfect test:
public class PostOrderServiceTests
{
    [Fact]
    public void HappyPath()
    {
        //Arrange
        var order = CreateTypicalOrder();
        //Act
        PostOrder(order);
        //Assert
        GetPostedInvoices().ShouldNotBeEmpty();
    }
}
Each test should be expressive, have a single responsibility, and be small. A reader should be able to see very quickly what the output of the system under test should be. Having too many asserts in a test method can make the code noisy and distract from the main message of the author.
The experimentation process that is automated testing should be controlled so that the impact of the system inputs is easily understood and proven.
There should be a control test. It serves as the control group, as would be found in an experiment in any other field of science. In the example above, this is the HappyPath test method. It provides a baseline for the normal operation of the system under test.
Only after we have this baseline can we then safely perform other experiments. Here is the next test:
    [Fact]
    public void Void_NotPosted()
    {
        //Arrange
        var order = CreateTypicalOrder();
        order.Status = Status.Void;
        //Act
        PostOrder(order);
        //Assert
        GetPostedInvoices().ShouldBeEmpty();
    }
This test shares quite a bit of code with the baseline test. Both tests call the CreateTypicalOrder, PostOrder, and GetPostedInvoices methods. The CreateTypicalOrder method sets up every experiment in a consistent, repeatable way. The PostOrder method executes every test in the same standardized way. GetPostedInvoices observes the results in a way that doesn’t impact the system under test, or at least impacts it consistently across all tests.
With the common baseline established, the effect of the input parameter can be seen. The only difference in how this test is set up compared with the baseline is that here we set the Status of the order to Void. We know from the baseline that a typical order is posted as an invoice (ShouldNotBeEmpty). But the result of this test is that no invoices were posted (ShouldBeEmpty). Because of this scientific control, we have proven that setting the Status to Void causes the order not to be posted as an invoice.
The scientific control ensures all four of Beck’s rules are followed. We have proven the system works. There is a single source of truth for how tests are set up, executed, and observed. Each test is expressive as to what is being tested, and it is small.
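To make the three shared helpers concrete, here is a minimal sketch of what might sit behind them. Everything below is an assumption for illustration: the post never shows the Order, Invoice, or PostOrderService types, so these are hypothetical stand-ins backed by an in-memory list, and the helpers are public only to keep the sketch easy to exercise.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Status { Open, Void }

public class Order
{
    public int Id { get; set; }
    public decimal Total { get; set; }
    public Status Status { get; set; }
}

public class Invoice
{
    public int OrderId { get; set; }
    public decimal Amount { get; set; }
}

// Hypothetical system under test: posts every non-void order as an invoice.
public class PostOrderService
{
    private readonly List<Invoice> _invoices;

    public PostOrderService(List<Invoice> invoices) => _invoices = invoices;

    public void Post(Order order)
    {
        if (order.Status == Status.Void) return;
        _invoices.Add(new Invoice { OrderId = order.Id, Amount = order.Total });
    }
}

// Hypothetical home for the helpers that every test shares.
public class PostOrderTestHelpers
{
    private readonly List<Invoice> _invoiceStore = new List<Invoice>();

    // Arrange: the same known-good starting point for every experiment.
    public Order CreateTypicalOrder() =>
        new Order { Id = 1, Total = 100m, Status = Status.Open };

    // Act: the single standardized way of running the system under test.
    public void PostOrder(Order order) =>
        new PostOrderService(_invoiceStore).Post(order);

    // Observe: hand back a copy so that looking at the results cannot change them.
    public IReadOnlyList<Invoice> GetPostedInvoices() => _invoiceStore.ToList();
}
```

With helpers along these lines, the only thing that differs between the control test and the Void test is the single line that changes the input.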
There should be no technical details exposed in the test methods. Those technical details are abstracted into sub-methods. In this example, they are implemented by the CreateTypicalOrder, PostOrder, and GetPostedInvoices methods.
The test method is so expressive that, in many cases, it could be shown to non-technical people. The tests can serve as living system documentation.
In this example, we can’t tell by looking at the test method whether CreateTypicalOrder creates the order as a record in a database, in an in-memory or mocked database, or just as a POCO. Nor can we tell from the method name PostOrder how the order is posted. Is it via a service class, a static method, a message queue, or a microservice function? This shows how flexible our test suite is. If we want to switch between any of these possibilities later, we can do so without changing any of the top-level tests.
Because the technical details are hidden in the sub-methods, the tests are easier to maintain. A code smell I see in some code reviews is a change to the system that forces many tests to change in the same way. This smell indicates that the technical details of test setup or execution are not properly abstracted. Keeping these details in separate sub-methods means they can be changed in one place.
I could go on and on regarding various strategies to abstract out the technical details. But I’ll stop at just a couple tips.
Be familiar with Factory Method, Abstract Factory, Builder, and other creational design patterns. These are your friends when it comes to the setup/arrange section of your test.
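As one illustration of a creational pattern in the arrange step, here is a small Builder sketch. The OrderBuilder class, its With... methods, and the Order shape are all hypothetical; the post only tells us that an Order has a Status that can be set to Void.

```csharp
using System;

public enum Status { Open, Void }

public class Order
{
    public decimal Total { get; set; }
    public Status Status { get; set; }
}

// Hypothetical test data builder: starts from "typical" defaults so a test
// only has to state the one value it cares about.
public class OrderBuilder
{
    private Status _status = Status.Open;
    private decimal _total = 100m;

    public OrderBuilder WithStatus(Status status)
    {
        _status = status;
        return this;
    }

    public OrderBuilder WithTotal(decimal total)
    {
        _total = total;
        return this;
    }

    public Order Build() => new Order { Status = _status, Total = _total };
}
```

A test's arrange step could then read `var order = new OrderBuilder().WithStatus(Status.Void).Build();`, which keeps the single changed input visible while the defaults stay in one place.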
Don’t be afraid to use multiple test classes if different setup or execution methods are needed. There is no need to follow a rule of a one-to-one relationship between the classes under test and test classes. Use inheritance or composition of test classes to share these methods between test classes, and offer the ability to override them where needed. Maintain the single source of truth across all test classes.
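Here is a rough sketch of sharing helpers through inheritance while overriding only the execution step. The class names, the in-memory queue, and the PostOrderService stand-in are assumptions made for this example, not the actual code behind the post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Status { Open, Void }
public class Order { public decimal Total { get; set; } public Status Status { get; set; } }
public class Invoice { public decimal Amount { get; set; } }

// Hypothetical system under test.
public class PostOrderService
{
    private readonly List<Invoice> _invoices;
    public PostOrderService(List<Invoice> invoices) => _invoices = invoices;

    public void Post(Order order)
    {
        if (order.Status != Status.Void)
            _invoices.Add(new Invoice { Amount = order.Total });
    }
}

// Shared setup and observation live here once; only execution varies.
public abstract class PostOrderTestsBase
{
    protected readonly List<Invoice> Invoices = new List<Invoice>();

    public virtual Order CreateTypicalOrder() =>
        new Order { Total = 100m, Status = Status.Open };

    public abstract void PostOrder(Order order);

    public IReadOnlyList<Invoice> GetPostedInvoices() => Invoices.ToList();
}

// Executes by calling the service directly.
public class DirectPostOrderTests : PostOrderTestsBase
{
    public override void PostOrder(Order order) =>
        new PostOrderService(Invoices).Post(order);
}

// Executes by sending the order through a (hypothetical) in-memory queue first.
public class QueuedPostOrderTests : PostOrderTestsBase
{
    private readonly Queue<Order> _queue = new Queue<Order>();

    public override void PostOrder(Order order)
    {
        _queue.Enqueue(order);
        var service = new PostOrderService(Invoices);
        while (_queue.Count > 0) service.Post(_queue.Dequeue());
    }
}
```

Test methods written against the base class helpers read the same in both subclasses, so switching how an order gets posted does not touch them.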
Create a Factory Method in the test class to do the common setup for the class. In this case, it is the CreateTypicalOrder method, which sets the properties of the Order to a state that is ready to post. That Factory Method can in turn call a Test Record Factory that is shared between multiple test classes. The Test Record Factory can take care of details such as required Customer parent records.
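The layering described above might look something like the following sketch. The TestRecordFactory class, the Customer and Order shapes, and the in-memory lists standing in for storage are all assumptions; a real factory would more likely write to whatever data store the tests use.

```csharp
using System;
using System.Collections.Generic;

public enum Status { Open, Void }

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
    public decimal Total { get; set; }
    public Status Status { get; set; }
}

// Hypothetical factory shared by many test classes. It knows which parent
// records an Order requires, so individual test classes do not have to.
public class TestRecordFactory
{
    private int _nextId = 1;

    public List<Customer> Customers { get; } = new List<Customer>();
    public List<Order> Orders { get; } = new List<Order>();

    public Customer CreateCustomer()
    {
        var customer = new Customer { Id = _nextId++, Name = "Test Customer" };
        Customers.Add(customer);
        return customer;
    }

    public Order CreateOrder()
    {
        var customer = CreateCustomer(); // required parent record
        var order = new Order { Id = _nextId++, CustomerId = customer.Id };
        Orders.Add(order);
        return order;
    }
}

// Test class (test methods omitted): its Factory Method layers the
// "ready to post" state on top of the shared record factory.
public class PostOrderServiceTests
{
    public TestRecordFactory Records { get; } = new TestRecordFactory();

    public Order CreateTypicalOrder()
    {
        var order = Records.CreateOrder();
        order.Total = 100m;
        order.Status = Status.Open;
        return order;
    }
}
```

The split keeps two kinds of knowledge apart: the record factory knows what a valid record needs in order to exist, and the test class knows what "typical" means for the behavior under test.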
Lastly, tests (or any code) shouldn’t contain any dead code.
There will probably be some shared setup (i.e., the constructor) in the test class. It should only set up the services needed to execute the tests. It shouldn’t create numerous test records that may or may not be used by a given test.
A code smell I sometimes see is test class setup that creates 5, 10, or more test records, even though most or all tests use only a single test record. Code maintainers then not only need to read the tests, but also have to memorize the setup and make sure their test is using the correct test record.
The extra code to create all those extra records detracts from the expressiveness of the tests and can also hurt performance. If a test needs two test records, it can simply call the CreateTypicalOrder method twice.
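A sketch of that shape: the constructor wires up services only, and a test that needs two records asks for two. The types and the TwoOrders_BothPosted test are hypothetical, and to keep the sketch free of package dependencies the assertion is a plain exception; in the style of the tests above it would carry xUnit's [Fact] attribute and use a Shouldly assertion instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Status { Open, Void }
public class Order { public int Id { get; set; } public Status Status { get; set; } }
public class Invoice { public int OrderId { get; set; } }

// Hypothetical system under test.
public class PostOrderService
{
    private readonly List<Invoice> _invoices;
    public PostOrderService(List<Invoice> invoices) => _invoices = invoices;

    public void Post(Order order)
    {
        if (order.Status != Status.Void)
            _invoices.Add(new Invoice { OrderId = order.Id });
    }
}

public class PostOrderServiceTests
{
    private readonly List<Invoice> _invoices = new List<Invoice>();
    private readonly PostOrderService _service;
    private int _nextOrderId = 1;

    // Shared setup creates the services only. No test records are made here.
    public PostOrderServiceTests()
    {
        _service = new PostOrderService(_invoices);
    }

    public void TwoOrders_BothPosted()
    {
        //Arrange: each test creates exactly the records it needs
        var first = CreateTypicalOrder();
        var second = CreateTypicalOrder();
        //Act
        PostOrder(first);
        PostOrder(second);
        //Assert
        if (GetPostedInvoices().Count != 2)
            throw new InvalidOperationException("Expected two posted invoices.");
    }

    private Order CreateTypicalOrder() =>
        new Order { Id = _nextOrderId++, Status = Status.Open };

    private void PostOrder(Order order) => _service.Post(order);

    private IReadOnlyList<Invoice> GetPostedInvoices() => _invoices.ToList();
}
```

Nothing exists in the test class that a given test does not ask for, so a reader only has to look at the test method to know which records are in play.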