Thursday, February 04, 2010

Keeping LOC and Tests in Balance

The proliferation of metrics in software development threatens to take important quantitative measures and bury them beneath an avalanche of noisy numbers. Consequently, it's important to look for certain ratios and trends among the numbers that tell you whether a project is healthy. One tell-tale relationship links LOC and the number of tests. These two values should grow in direct proportion to each other.

The included diagram presents the ratio of these two values for Platypus, the OSS project I work on.

As you can see, except for a few dips here and there, these numbers have stayed in lock step for the last 18 months. And, as you might expect, code coverage from these tests has similarly remained in a fairly narrow range, right around 60%.

The most typical violation of this ratio is, as you would guess, a jump in LOCs without a corresponding rise in tests. This is something managers should watch out for. With a good dashboard, they can tell early on when these trend lines diverge. Divergence is frequently, but not always, indicative of a problem. (For example, it could be that a large body of code without tests was imported into the project.) Whatever the cause, managers need to find out and respond accordingly.
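The dashboard check described above can be sketched in a few lines. This is a hypothetical illustration, not code from Platypus: the sample history and the 25% tolerance are assumptions chosen for the example.

```python
# Hypothetical sketch: flag snapshots where the LOC-to-test ratio drifts
# away from its baseline, signaling that code and tests have diverged.

def divergence_points(snapshots, tolerance=0.25):
    """Return labels of snapshots whose LOC-per-test ratio differs from
    the first snapshot's ratio by more than `tolerance` (fractional)."""
    _, base_loc, base_tests = snapshots[0]
    baseline = base_loc / base_tests
    flagged = []
    for label, loc, tests in snapshots:
        ratio = loc / tests
        if abs(ratio - baseline) / baseline > tolerance:
            flagged.append(label)
    return flagged

# Made-up project history: (snapshot label, total LOC, number of tests).
history = [
    ("2008-08", 12000, 400),  # 30 LOC per test (baseline)
    ("2009-02", 15000, 500),  # still 30:1 -- in step
    ("2009-08", 24000, 560),  # ~43:1 -- LOC jumped without tests
    ("2010-02", 26000, 870),  # ~30:1 -- back in balance
]

print(divergence_points(history))  # → ['2009-08']
```

In practice the snapshot data would come from the build server or a tool like a coverage report parser, but the comparison itself is this simple.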

(For the record, the tests counted in this diagram include unit tests and functional tests.)


Jesse Gibbs - Atlassian said...

Interesting and useful metric!

I'm glad to see a more intelligent approach to measuring risk than simply looking at the absolute code coverage %.

A few more ideas for measurement and detection that can lead to less risk in your code:

* Measure coverage or number of tests vs. complexity of code (more complex code needs more tests to cover all paths).

* Detect code that has recently changed (new or updated) and lacks tests. This code is not as 'hardened' and thus introduces more risk.

* Look for code that has lost coverage since previous runs. In Atlassian's Clover tool for Java code coverage, we refer to these as 'Movers'.

* If you are concerned with absolute coverage (a fairly unimportant and abused metric, IMO), then take advantage of the ability to exclude trivial code from your coverage report. This eliminates the incentive to write trivial tests that only serve to raise coverage.

santosh said...

Useful tips.

Alex Ooi said...

Surely lines of tests should exceed LOC...

In projects I've worked on that have had high code coverage, I've often found that we may be writing 2-3 times more lines of tests than code. And this seems perfectly reasonable to me, given that tests need test data to be set up before a test and assertions after it.

And in order to test for all permutations, it is often required that there are several Test methods for one particular Production method (for example, testing how a method handles a null, blank and non-blank string).
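The point above can be made concrete with a small sketch. The function `normalize_name` is made up for illustration; the pattern is simply one test method per input class (None, blank, non-blank), which is why test lines outpace production lines.

```python
import unittest

def normalize_name(name):
    """Return a trimmed, title-cased name; None or blank input yields ''."""
    if name is None or not name.strip():
        return ""
    return name.strip().title()

# One production method, three test methods -- one per input permutation.
class NormalizeNameTest(unittest.TestCase):
    def test_none_input(self):
        self.assertEqual(normalize_name(None), "")

    def test_blank_input(self):
        self.assertEqual(normalize_name("   "), "")

    def test_non_blank_input(self):
        self.assertEqual(normalize_name("  ada lovelace "), "Ada Lovelace")
```

Run with `python -m unittest` in the usual way; even this toy case already has three times as many test lines as production lines.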

However, it is comforting to see more people pushing the case for writing automated tests. And the use of metrics to monitor lines of code vs tests is even more refreshing!

Kathleen Erickson said...

Another useful metric is LOC vs. the cost to write it. Some significant savings are to be had by not writing, documenting, testing and maintaining code that could be auto-generated. My guess is that reducing LOC could significantly improve ROI (and application performance) - IT just needs to see how much they can save. We include an ROI calculator in our WebORB product. It's nice to see LOC addressed in this blog.

Hannah said...

Amazing Blog. Thanks for sharing such a good post.