Sunday, November 25, 2007

The Fallacy of 100% Code Coverage

While I love RSS for aggregating feeds from various blogs, nothing beats having an expert combing through articles and posts, culling the best ones. Few people, if any, do that culling better for software development than Andy Glover of Stelligent. His blog posts weekly collections of interesting links to tools and agile development topics. It's one of the first RSS items I read. (Fair warning, Glover makes his selections livelier by describing them with terms from the disco era.)

A meme that has appeared recently in his links is a curious dialog about the value of 100% code coverage in unit tests. It was not until I read those posts that I realized there were people still unclear on this issue. So, let's start at the foundational level: 100% code coverage is a fallacious goal. Unit testing is designed to provide two principal benefits: 1) validate the operation of code; 2) create sensors that can detect when code operation has changed, thereby identifying unanticipated effects of code changes. There is no point in writing tests that do not fulfill one of the two goals. Consequently, a getter or setter should not be the target of a unit test:

public void setHeight( float newHeight )
{ height = newHeight; }

This code cannot go wrong (unless you believe that your language's assignment operator doesn't work consistently ;-). Likewise, there is no benefit in a test as a sensor here. The operation of this code cannot change. Hence, any time spent writing a unit test for this routine is wasted.

Developers who can't resist the allure of 100% code coverage are, it seems, seduced by one of two drivers:

1) their coverage tool gives classes with 100% code coverage a green bar (or other graphical satisfaction). Describing this phenomenon, Cedric Beust, the author of the TestNG testing framework, writes in his just released (and highly recommended) book, Next Generation Java Testing, that this design of coverage tools is "evil." (p. 149) And later, he correctly piles on with: "Even if we were stupid enough to waste the time and effort to reach 100% code coverage..." The problem with coding for graphical trinkets is explained in the next driver.

2) if developers attain 100% code coverage--even at the cost of writing meaningless tests--they can be certain they haven't forgotten to test some crucial code. This viewpoint is the real illusion. By basing proper testing on 100% code coverage, the developer has confused two issues. It's what you're testing and how (so, quality) that determines code quality, not numerical coverage targets (quantity). Capable writers of unit tests know that some routines need validation through dozens of unit tests. Because these tests repeatedly work the same code, they don't result in greater test coverage, but they do result in greater code quality. By pushing for an artificial goal of 100% a developer is incentivized against writing multiple tests for complex code, in order to have the time to write tests for getters and setters. That surely can't be right.
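To make the quality-versus-quantity point concrete, here is a minimal sketch. The isLeapYear routine and the check helper are hypothetical, chosen only for illustration. A single call executes every line of the routine, so line coverage is already at 100% after the first check; the later checks add nothing to that number, yet they are the ones that pin down the century rules.

```java
// Hypothetical example: one short routine, several meaningful checks.
final class LeapYearDemo {

    // Gregorian leap-year rule. The body is a single statement, so any
    // one call gives this method full line coverage.
    static boolean isLeapYear(int year) {
        return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    }

    // Plain helper so the sketch runs without a test framework.
    static void check(boolean condition, String label) {
        if (!condition) {
            throw new AssertionError(label);
        }
    }

    public static void main(String[] args) {
        check(isLeapYear(2004), "divisible by 4");
        check(!isLeapYear(2007), "not divisible by 4");
        check(!isLeapYear(1900), "century not divisible by 400");
        check(isLeapYear(2000), "century divisible by 400");
        check(!isLeapYear(2100), "next non-leap century");
        System.out.println("all leap-year checks passed");
    }
}
```

Dropping the last three checks would leave the line-coverage figure unchanged while removing the only checks that exercise the 100 and 400 rules.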

17 comments:

David said...

via dzone
as you hint, code coverage is not the same as code pathways, but while 100% coverage is generally achievable although of dubious use, testing all pathways is not typically feasible.
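
A rough sketch of the difference, using made-up names (perUnit and its two flags are hypothetical): two checks execute every line and both outcomes of each decision, yet one of the four paths through the method is never run, and that path is the one that fails.

```java
// Hypothetical example: full line and branch coverage, incomplete path coverage.
final class PathDemo {

    // Two independent decisions give four possible paths.
    static int perUnit(int total, boolean skipFirst, boolean skipSecond) {
        int units = 2;
        if (skipFirst) {
            units--;
        }
        if (skipSecond) {
            units--;
        }
        return total / units; // units is 0 only when both flags are true
    }

    public static void main(String[] args) {
        // These two calls together hit every line and every branch outcome.
        System.out.println(perUnit(10, true, false));
        System.out.println(perUnit(10, false, true));

        // The path with both flags set was never exercised above.
        try {
            perUnit(10, true, true);
        } catch (ArithmeticException e) {
            System.out.println("untested path failed: " + e.getMessage());
        }
    }
}
```

With only two decisions the missing path is easy to spot; the number of paths doubles with each further independent decision, which is why covering all of them is rarely feasible.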

Stephan said...

"By pushing for an artificial goal of 100% a developer is incentivized against writing multiple tests for complex code, in order to have the time to write tests for getters and setters. That surely can't be right."

It's not artificial to request 100% CC coverage. With 100% CC coverage all your decisions are covered.

Besides that, most people I know use reflection to test setters and getters automatically, so there is no need to write any code to test basic setters and getters. And when they evolve and contain new business logic, you detect that through failing automatic tests and can write a custom unit test (which you should, as it is no longer a simple getter/setter).
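
As a rough sketch of what such a reflection-based check might look like, assuming plain java.lang.reflect and handling only float properties: the Box bean and the checkFloatAccessors helper below are hypothetical names for illustration, not part of any particular tool.

```java
import java.lang.reflect.Method;

final class AccessorRoundTrip {

    // Hypothetical bean, used only for this illustration.
    static class Box {
        private float height;
        private float width;

        public float getHeight() { return height; }
        public void setHeight(float newHeight) { height = newHeight; }
        public float getWidth() { return width; }
        public void setWidth(float newWidth) { width = newWidth; }
    }

    // For each public setX(float) that has a matching float getX(), store a
    // sample value and confirm the getter returns it. Returns the number of
    // setter/getter pairs that were checked.
    static int checkFloatAccessors(Object bean) {
        int checked = 0;
        float sample = 1.5f;
        for (Method setter : bean.getClass().getMethods()) {
            Class<?>[] params = setter.getParameterTypes();
            if (!setter.getName().startsWith("set")
                    || params.length != 1 || params[0] != float.class) {
                continue;
            }
            try {
                Method getter = bean.getClass()
                        .getMethod("get" + setter.getName().substring(3));
                if (getter.getReturnType() != float.class) {
                    continue;
                }
                sample += 1.0f; // a different value for each property
                setter.invoke(bean, sample);
                float actual = (Float) getter.invoke(bean);
                if (actual != sample) {
                    throw new AssertionError(setter.getName() + " did not round-trip");
                }
                checked++;
            } catch (NoSuchMethodException e) {
                // No matching getter: nothing to check for this setter.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return checked;
    }

    public static void main(String[] args) {
        System.out.println(checkFloatAccessors(new Box()) + " accessor pairs checked");
    }
}
```

A setter that later gains validation or other logic would typically make the round trip fail, which is the signal to replace the generic check with a hand-written test.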

Besides that, I believe it's easier to keep your coverage at 100% than to keep it at 80%.

Peace
-stephan

--
Stephan Schmidt :: stephan@reposita.org
Reposita Open Source - Monitor your software development
http://www.reposita.org
Blog at http://stephan.reposita.org - No signal. No noise.

Mr. Neighborly said...

I think you have a poor view of what testing is. Testing (if you're doing the test-first/driven type at least) is supposed to be used to specify behavior, not to verify it.

So, if I'm creating a getter or setter, I want to specify that my class should have that before I write the class. I specify it, run the test, it fails. I write the getter/setter, re-run the test, all tests pass, the class's behavior is complete.

Now with these tests I can hand them to someone who doesn't necessarily know the system through and through or possibly even know how to write code and they can basically look over the function names (or if you're doing BDD, the spec names) and tell me if I'm on the right track. I can also 100% verify that I haven't broken something on a dependent code path by overriding a method, etc.

100% code coverage is possible and is something to strive for, but I think making it a requirement (as some shops do these days) is silly.

Andrew said...

This is preposterous. It's as if you're suggesting that the only way to write good code is to write the code well. Next you'll be saying that the only way for management to ensure their programmers are putting out quality is to hire good programmers.

Clearly an insupportable position.

Anonymous said...

Tests for setters and getters can be necessary, though not in compiled languages. In dynamic languages like Python, you cannot always verify that simple stuff like a setter or getter does what you want without a unit test.

dre said...

If you feel that the problem is only with getters and setters, then why not just ignore them in the code coverage report?

Mike Slattery posted about re-targeting code coverage, and provided a patch for Cobertura to ignore simplistic getters and setters.

I do think that code coverage allows you to learn how to write better unit tests. Code coverage therefore isn't a means to an end, or any end goal. It's a tool, just like unit testing, to further code quality and improve the results of testing efficiently.

For example, if you're writing unit tests and find a huge cluster of bugs (no matter the size of your codebase) but have only reached 10% code coverage (decision-condition coverage, not just statement coverage), then you have done a really good job writing your unit tests and may actually be done. If a new person starts writing unit tests every day throughout the project and continually gets 80-99% code coverage with very few bugs, then there is either a problem with the developer-tester or you have perfect coders (less likely).

Code coverage is simply a tool - you can use it in your IDE or you can use it at build time, comparing the results of yourself (and your unit tests) with that of the rest of the team.

This whole notion of having a green bar at all times is out of control with some developers. These are the same people who never promote warnings to errors. They will write comments to prevent certain lines from being checked by their static analyzer just to get "clean code". The sad part is when managers start wanting to hire/promote these guys, while punishing the developers who show how shoddy their code is.

Larry Clapp said...

public void setHeight( float newHeight )
{ height = newHeight; }

It can't go wrong and can't change. Hmmm.

%s/height/something_else/

Oops.

"Constants aren't, variables won't." Things that "can't" go wrong often do, and things that "can't" change often do. When you assert that some code doesn't need unit tests because it can't go wrong and can't change, you also assert that no one that ever works on the project will ever make a simple typo, will never hit the delete key in the wrong place, and (in the parlance of Common Lisp) will never, say, put an "around" method on the setter and forget to call call-next-method.

Ryan Ginstrom said...

I agree with reasons others have given for 100% coverage (hitting every line, not just calling every method): TDD, documentation, safety net when refactoring, and (especially important in my case) preventing stupid errors.

There is a problem with focusing on 100% coverage, though: thinking that you're done just because you've got it. I think that's a much bigger problem than wasted time unit testing getters and setters.

You've got to also test the edge cases, cases that intuition and experience tell you are likely to cause errors, etc.

Andrew Binstock said...

@stephan: You wrote: "With 100% code coverage all your decisions are covered." No, this is not correct, and is another fallacy I'll have to blog on. See David's post (right above yours) for a slightly different sentiment that clarifies the problems with your assertion.

@mr. neighborly: Agreed that TDD is a different matter, although 100% code coverage is not a goal of TDD that I know of; rather, it's a side effect.

@larry: not sure I agree. If you see unit tests as a way of preventing future stupid errors, then sure I'm cool with what you propose. For myself, I find it hard enough to write code that works correctly and is clear. Worrying about building in safeguards for possible errors that might or might not occur down the road is a near-endless task, and investing in 100% CC seems like an extravagant price to pay a priori. I'd vote for spending the time refactoring and documenting before writing unit tests for getters/setters.

@dre, @ryan: thanks for thoughtful responses with different takes.

Stephan said...

@Andrew: I'm confused because I thought cyclomatic complexity (CC) does represent all possible paths through the code. So with 100% CC coverage you have 100% path coverage.

Peace
-stephan

Rich said...

Personally, I believe the greatest value of using a code coverage tool is not to determine if your code is 100% tested (in fact, on larger, older projects I rarely see 100%) but to determine what areas of code are consistently not covered.

For example, the greatest area of untested code I see when introducing a code coverage tool on a project is exception handling. If developers can see this and learn where their testing has omissions, their skill set increases.

Also, I think another related conversation/argument is choosing between writing your own tests and using a tool that generates them for you....oops! Sorry Andrew, looks like another can of worms opening.. ;-)
-Rich

Ryan Allen said...

That's not true, the code snippet:

public void setHeight( float newHeight )
{ height = newHeight; }

The behaviour is that height is being set to newHeight. I don't know if the Java compiler would detect this (I work exclusively in dynamic languages):

public void setHeight( float newHeight )
{ height = newHeightLOLCATS; }

If the compiler doesn't pick that up, then you have a broken test case that you haven't tested for.
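
For what it's worth, the Java compiler does reject an undeclared name like newHeightLOLCATS. A setter mistake it accepts is the self-assignment that results when the parameter has the same name as the field. A minimal sketch, with a hypothetical Shape class:

```java
// Hypothetical example: a setter bug that compiles cleanly.
final class ShadowingDemo {

    static class Shape {
        private float height;

        // The parameter shadows the field, so this assigns the parameter
        // to itself and leaves the field unchanged.
        void setHeightBroken(float height) {
            height = height;
        }

        // Qualifying with "this" assigns the field as intended.
        void setHeight(float height) {
            this.height = height;
        }

        float getHeight() {
            return height;
        }
    }

    public static void main(String[] args) {
        Shape shape = new Shape();
        shape.setHeightBroken(3.0f);
        System.out.println(shape.getHeight()); // prints 0.0: the field was never set
        shape.setHeight(3.0f);
        System.out.println(shape.getHeight()); // prints 3.0
    }
}
```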

Andrew Binstock said...

@stephan: Sorry, in your original reply, I took your "100% CC" to mean 100% code coverage, not as you meant it: "100% cyclomatic complexity," which is a term I'm not familiar with. Indeed, if you cover every logical branch, you're going to cover a lot of your code. But surely many paths will require more than one test, while some paths will hardly be worth the time to write a test. Where you draw the line reflects the main point of my post: that getting every last line tested (such as methods where CC = 1) is not an inherently desirable goal. There are better things to do to improve your code than getting every bit tested.

By the way, here is a post on this blog relating to the minimum number of tests to use in relation to the cyclomatic complexity of a method. In retrospect, I think it presents too low a minimum.

David Dossot said...

Though I totally agree that 100% code coverage is a fallacy, I think it is good to aim for it not for the sake of it but just as a driver to improve the code.

In that matter, let me quote you: "It appeals to me because almost invariably when I refactor code to make it more testable, I find that I've improved it".

Tony said...

stephan said: "Besides that, I believe it's easier to keep your coverage at 100% than to keep it at 80%."

I agree with that as an important reason to push for 100% coverage. If my setter / getter isn't working, I'd like my unit test to tell me that.

My approach is that I write my code test first, continue developing till all tests pass, then I check my coverage and it is interesting to me that often my code coverage tool picks up an edge case I had forgotten to test but had coded.

By having 100% coverage, these missing tests jump out much more than if they were just another untested line amongst a whole bunch of untested setters / getters.

I've written more here:

http://homepage.mac.com/hey.you/lessons.html

Tony

CodeMonkey said...

I'm getting annoyed with a lot of developers who use this as an excuse not to improve or measure their unit tests and code coverage.

100% coverage will give you more confidence in your system than 30%, but it's important to understand that 100% code coverage doesn't mean 100% of scenarios are tested; e.g., a while loop that explodes after the nth execution wouldn't necessarily be caught by a test that achieved 100% code coverage.
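
A minimal sketch of that kind of loop, with hypothetical names (sumViaBuffer and CAPACITY are made up for illustration): a three-element input executes every line and both outcomes of the loop condition, while a five-element input fails on the fifth iteration.

```java
// Hypothetical example: a loop that only fails after several iterations.
final class LoopDemo {

    private static final int CAPACITY = 4; // assumed fixed buffer size

    // Copies the values into a fixed-size buffer and returns their sum.
    // The missing bounds check only matters once i reaches CAPACITY.
    static int sumViaBuffer(int[] values) {
        int[] buffer = new int[CAPACITY];
        int sum = 0;
        int i = 0;
        while (i < values.length) {
            buffer[i] = values[i];
            sum += buffer[i];
            i++;
        }
        return sum;
    }

    public static void main(String[] args) {
        // This call alone covers every line of sumViaBuffer.
        System.out.println(sumViaBuffer(new int[] {1, 2, 3}));

        // A longer input runs the same lines but overflows the buffer.
        try {
            sumViaBuffer(new int[] {1, 2, 3, 4, 5});
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("fifth iteration failed");
        }
    }
}
```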

i.e. strive for the best code coverage you can get, but don't believe this makes your system bug free.

There are of course reasons why you probably should never get 100% coverage as writing tests for getters/setters or some other generated code doesn't always make sense.

bloggist said...

Umm, what does "culling" mean?