Testing software is hard. Good test cases can help you improve the maintainability and stability of your code, while bad test cases not only fail to help but can even slow down your development process.
It is good to see many test cases being written in our team now, but some of them don’t really benefit the team and have even become a burden on development. I think part of the reason is that we don’t have guidelines for designing good test cases that really help us. So I’d like to build a basic framework for how to write test cases in UT and E2E.
To understand why the framework is constructed the way it is, we first need some general knowledge about testing. We will discuss this in a bit.
In order to write good test cases, we need to understand our goals. There are lots of reasons why we need tests when we develop software. Here is the most obvious one:
As your software grows, you will inevitably need to verify more and more logic to ensure that each code change is correct and doesn’t accidentally break something that already exists. Automated testing is a good way to reduce the cost of that verification.
Software testing has two key challenges:
The system may seem to work fine across a broad range of inputs, and then abruptly fail at a single boundary point.
Therefore, test cases must be chosen carefully and systematically.
1 | /** |
What is a good test suite for this function?
Systematic testing means that we are choosing test cases in a principled way, with the goal of designing a test suite with three desirable properties:
Software usually has a wide range of input values that produce different behaviors in different ranges. We want to pick a set of test cases that is small enough to be easy to write and maintain and quick to run, yet thorough enough to find bugs in the program.
To do this, we divide the input space into partitions, each consisting of a set of inputs.
Then we choose one test case from each partition, and that’s our test suite.
The idea behind partitions is to divide the input space into sets of similar inputs on which the program has similar behavior.
Let’s look at the abs(a) function:
1 | /** |
We focus on where this function produces different behaviors across the input space:
If a >= 0, abs() returns a.
If a < 0, abs() returns -a.
So we can divide the input space a: number into two partitions: a < 0 and a >= 0.
And then we pick one test case from each partition and form our test suite:
1 | // case 1: negative input |
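A minimal sketch of such a two-case suite in TypeScript. The abs implementation shown here and the use of Node's built-in assert module are assumptions made for illustration; the team's actual function and test framework may differ.

```typescript
import { strictEqual } from "node:assert";

// Assumed implementation of the function under test.
function abs(a: number): number {
  return a < 0 ? -a : a;
}

// case 1: negative input (partition a < 0)
strictEqual(abs(-5), 5);

// case 2: non-negative input (partition a >= 0)
strictEqual(abs(7), 7);
```

Any other value from the same partition should serve equally well, because the function is expected to behave the same way across the whole partition.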
Bugs often occur at the boundaries between partitions. Some examples:
Extreme numeric values, such as Number.MAX_SAFE_INTEGER or Number.MAX_VALUE.
Why are these boundaries dangerous? Here are two main reasons:
Off-by-one mistakes are easy to make, such as writing <= instead of <, or initializing a counter to 0 instead of 1.
Behavior can change abruptly at a boundary. Once a number grows past Number.MAX_SAFE_INTEGER, for example, it suddenly starts to lose precision.
So we can add another test case to our abs() test suite:
1 | // case 1: negative input |
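As a rough sketch, boundary cases for abs() might look like the following. The abs implementation and the choice of Node's assert module are assumptions for illustration. The last assertion is not a test of abs() at all; it only demonstrates the precision loss past Number.MAX_SAFE_INTEGER mentioned above.

```typescript
import { strictEqual } from "node:assert";

// Assumed implementation of the function under test.
function abs(a: number): number {
  return a < 0 ? -a : a;
}

// The boundary between the two partitions.
strictEqual(abs(0), 0);

// The values right next to the boundary, one on each side.
strictEqual(abs(-1), 1);
strictEqual(abs(1), 1);

// Extreme values of the number type.
strictEqual(abs(-Number.MAX_SAFE_INTEGER), Number.MAX_SAFE_INTEGER);
strictEqual(abs(-Number.MAX_VALUE), Number.MAX_VALUE);

// Past MAX_SAFE_INTEGER, two different integers collapse into the same value.
strictEqual(Number.MAX_SAFE_INTEGER + 1 === Number.MAX_SAFE_INTEGER + 2, true);
```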
In the previous example, we picked one input from each partition as a test case. That works well. But as the program becomes more complex and its input has multiple dimensions, we run into a problem: even if we pick just one test case per partition, the number of input combinations can still be overwhelming, which breaks the rule that our test suite should be small and fast.
Let's take a look at the multiply example:
This function has a two-dimensional input space, consisting of all the pairs of integers (a,b). According to the rules of multiplication, we can separate the input space into these partitions:
And then, we add the boundary cases:
After separating each dimension into:
we end up with a complex partition graph like this:
If we pick one test case (the dots on the graph) for each partition, there are 36 combinations. That’s a lot! This is the so-called combinatorial explosion, and this is with only two dimensions.
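To make the growth concrete, here is a small sketch that enumerates every combination of per-dimension partitions. The labels p1 to p6 are placeholders, not the actual partition names; only their count (six per dimension, so that 6 × 6 = 36) is taken from the text.

```typescript
import { strictEqual } from "node:assert";

// Placeholder labels for the partitions of one dimension (assumed: six of them).
const partitions = ["p1", "p2", "p3", "p4", "p5", "p6"];

// Build every combination of one partition per dimension (a Cartesian product).
function combinations(labels: string[], dimensions: number): string[][] {
  let result: string[][] = [[]];
  for (let d = 0; d < dimensions; d++) {
    const next: string[][] = [];
    for (const prefix of result) {
      for (const label of labels) {
        next.push([...prefix, label]);
      }
    }
    result = next;
  }
  return result;
}

// Two dimensions (a and b): 6 * 6 combinations.
strictEqual(combinations(partitions, 2).length, 36);

// A hypothetical third input with the same partitions: 6 * 6 * 6 combinations.
strictEqual(combinations(partitions, 3).length, 216);
```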
How do we solve it? The number of test cases grows exponentially with the number of dimensions: it is O(s^n), where s is the number of partitions in each dimension and n is the number of dimensions. Yet from the perspective of any single dimension, we are covering the same partition over and over again. So instead, we can treat the features of each input, a and b, as two separate partitionings of the input space. Each partitioning considers only one input:
1 | // partition on a: |
Then we combine those values without repeating them. This gives us a test suite that covers every partition of each dimension, and the number of test cases no longer grows exponentially with the number of dimensions: it is O(s).
This does increase the risk of missing bugs that only appear for particular combinations of inputs, but we can add another layer of partitions to cover some of those combinations:
1 | // a and b are both positive, a is a small integer, b is a LARGE_NUMBER |
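The whole approach could be sketched roughly as follows. The multiply implementation, the representative values, and the names perDimensionCases and signCombinationCases are all hypothetical, chosen only for illustration. The first table pairs values so that each assumed partition of a and of b appears at least once; the second adds the sign combinations that the first table happens to miss.

```typescript
import { strictEqual } from "node:assert";

// Assumed implementation of the function under test.
function multiply(a: number, b: number): number {
  return a * b;
}

// Each row is [a, b, expected product].
type Case = [number, number, number];

// Layer 1: every assumed partition of a (0, 1, small +, small -, large +, large -)
// and of b appears in at least one row. Six rows instead of 36.
const perDimensionCases: Case[] = [
  [0, 1, 0],
  [1, 0, 0],
  [12, -1_000_000, -12_000_000],
  [-12, 12, -144],
  [1_000_000, -12, -12_000_000],
  [-1_000_000, 1_000_000, -1_000_000_000_000],
];

// Layer 2: sign combinations not yet covered by layer 1.
const signCombinationCases: Case[] = [
  [12, 1_000_000, 12_000_000], // both positive, a small, b large
  [-12, -12, 144], // both negative
];

for (const [a, b, expected] of [...perDimensionCases, ...signCombinationCases]) {
  strictEqual(multiply(a, b), expected);
}
```

That is eight test cases in total. The trade-off is that combinations not listed in either table remain untested.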
Software Engineering, Testing — Apr 10, 2024