The more I use unit testing, the more I wander into areas that are typically the province of true testers rather than developers. One area I frequently visit is combinatorial testing: how to test code that must handle a large number of possible input values. Say I have a function with four parameters that are all boolean. That gives 2^4 = 16 possible combinations, and my temptation is to write 16 unit tests. But the concept of pairwise testing argues against this test-every-combination approach. It is based on the belief that most bugs arise from the interaction of pairs of values, rather than from a specific configuration of three or four of them. So pairwise experts look at the 16 possible settings my switches can have and choose the smallest set of tests in which every pair of values has been exercised. It turns out that five tests are enough to exercise every pair of switch values.
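To make the five-test claim concrete, here is a minimal Python sketch. The particular five rows and the helper name `uncovered_pairs` are my own choices for illustration, not something taken from a tool; other sets of five would work equally well. The check confirms that for every pair of parameters, all four True/False value pairs appear in at least one test.

```python
from itertools import combinations, product

# One possible set of five tests for four boolean parameters.
# These rows are an illustrative choice; other valid sets exist.
PAIRWISE_TESTS = [
    (False, False, False, False),
    (False, True, True, True),
    (True, False, True, True),
    (True, True, False, True),
    (True, True, True, False),
]


def uncovered_pairs(tests, num_params=4):
    """Return every (param_i, param_j, value_pair) that no test exercises."""
    missing = []
    for i, j in combinations(range(num_params), 2):
        seen = {(t[i], t[j]) for t in tests}
        for pair in product([False, True], repeat=2):
            if pair not in seen:
                missing.append((i, j, pair))
    return missing


exhaustive = list(product([False, True], repeat=4))
print(len(exhaustive))                  # 16
print(len(PAIRWISE_TESTS))              # 5
print(uncovered_pairs(PAIRWISE_TESTS))  # []
```

Dropping any one of the five rows leaves at least one pair uncovered, which is why four tests are not enough here.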
The question I've wondered about is this: if I write those five unit tests rather than the more ambitious 16, what have I given up? The answer is: not much. At the recent Software Test & Performance Conference, I attended a session by BJ Rollison, who heads up TestingMentor.com when he's not drilling Microsoft's testers in the latest techniques. He provided some interesting figures from Microsoft's analysis of pairwise testing.
For attrib.exe (which takes a path + 6 optional args), he ran minimal, pairwise, and comprehensive test suites, with the following results:
Minimal: 9 tests, 74% code coverage, 358 code blocks covered.
Pairwise: 13 tests, 77% code coverage, 370 code blocks covered.
Maximal: 972 tests, 77% code coverage, 370 code blocks covered.
A similar test exploration with findstr.exe (which takes a string + 19 optional args) found that pairwise testing via 136 tests covered 74% of the app, while maximal coverage consisting of 3,533 tests covered 76% of the app.
These numbers make sense. Once you have tested every pair of values, additional combinations are unlikely to exercise completely different routines, so code coverage should not increase much for tests beyond the pairwise recommendations. What surprised me was that pairwise testing got such high numbers to begin with. 70+% is pretty decent coverage.
From now on, pairwise testing will be part of my unit-testing design toolbox. For a list of tools that can find the pairs to test, see here. Rollison highly recommended Microsoft's free PICT tool (see previous link), which also provides a means to specify special relationships between the various factors in the combinations.
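For a feel of what such tools do, here is a rough greedy sketch in Python. It is not PICT's algorithm, and the function name `pairwise_suite`, the `is_valid` hook, and the switch names are all hypothetical. It enumerates every combination up front, so it only suits small inputs, and a greedy search is not guaranteed to find the smallest possible suite. The `is_valid` callback stands in for the kind of "special relationships" a real tool lets you declare.

```python
from itertools import combinations, product


def pairwise_suite(parameters, is_valid=lambda case: True):
    """Greedily pick test cases until every achievable value pair is covered.

    parameters: dict mapping a parameter name to its list of possible values.
    is_valid:   hypothetical hook that rejects combinations which cannot occur.
    """
    names = list(parameters)
    candidates = [
        dict(zip(names, values))
        for values in product(*(parameters[n] for n in names))
    ]
    candidates = [c for c in candidates if is_valid(c)]

    def pairs_of(case):
        # Every pair of (parameter, value) settings that this case exercises.
        return {((a, case[a]), (b, case[b])) for a, b in combinations(names, 2)}

    # Only pairs that occur in some valid candidate need to be covered.
    uncovered = set().union(*(pairs_of(c) for c in candidates))
    suite = []
    while uncovered:
        # Take the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda c: len(pairs_of(c) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite


# Hypothetical switches, with one made-up rule: force and dry_run never go together.
switches = {
    "verbose": [True, False],
    "recursive": [True, False],
    "force": [True, False],
    "dry_run": [True, False],
}
suite = pairwise_suite(switches, is_valid=lambda c: not (c["force"] and c["dry_run"]))
for case in suite:
    print(case)
```

Each printed dictionary is one test case to write. In practice I would hand this job to a dedicated tool rather than maintain a generator like this myself.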