Sunday, December 28, 2008

Code Coverage Metrics

Code coverage reports are great; I love the information they give me. I also love the idea of failing a build if your code coverage metrics drop below a certain point. But I think it's generally accepted that code coverage numbers can be very misleading. A low % of lines covered is certainly bad, but a high % of lines covered doesn't necessarily mean you've done a good job either. You could have a bunch of tests that really don't exercise many edge cases but still hit all the lines of code. That doesn't mean the code has been tested well.
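Here's a minimal sketch of what I mean. The class and the average() method are made up for illustration, and I'm using plain static methods instead of a real test framework so it runs on its own. The single happy-path check executes every line of average(), so a line coverage report would show 100%, yet the empty-array case is never exercised.

```java
public class CoverageIllusion {

    // Hypothetical method under test: integer average of an array.
    static int average(int[] values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        // Integer division by zero throws ArithmeticException for an empty array.
        return sum / values.length;
    }

    // This one check touches every line of average(), so line coverage is 100%.
    static boolean happyPathPasses() {
        return average(new int[] {2, 4, 6}) == 4;
    }

    // The edge case that the 100% figure says nothing about.
    static boolean emptyInputThrows() {
        try {
            average(new int[0]);
            return false;
        } catch (ArithmeticException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("happy path passes: " + happyPathPasses());
        System.out.println("empty input throws: " + emptyInputThrows());
    }
}
```

Both lines print true, which is the point: the coverage number looks perfect while the method still blows up on input nobody tested.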

I find this sort of thing happens with integration tests. One medium-sized integration test can "cover" lots and lots of code. I could remove a number of unit tests and still have the same coverage % because I have a lot of integration tests. These days I'm not as interested in the overall coverage % (okay, I still want close to 100), but I'm really interested to know: if I run Emma (for example) on FooTest alone, is Foo 100% covered? In my ideal world each test would cover its related class 100%. I find the Emma plugin for Eclipse really helpful for that kind of analysis, and I'd love a tool that would give me that kind of report.

Sadly that tool doesn't exist. In the current world coverage metrics are great, but they leave something to be desired. After discovering the moreunit plugin I've realized how a tool like that could help enhance coverage metrics. I want to know, for every public method in Foo, whether there is a corresponding test method in FooTest. Combined with the code coverage %, that kind of metric could put a confidence value on how good your code coverage is. Sadly that tool doesn't exist either.
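To make the idea concrete, here's a rough sketch of such a check using plain Java reflection. This is not how moreunit works internally (I haven't modeled it on the plugin); it simply assumes a "test" + capitalized method name convention, and the class names TestMethodAudit, Foo and FooTest are hypothetical stand-ins.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class TestMethodAudit {

    // Returns the names (sorted, no duplicates) of public methods declared on
    // target that have no test method starting with "test" + capitalized name.
    static List<String> untestedMethods(Class<?> target, Class<?> testClass) {
        TreeSet<String> missing = new TreeSet<String>();
        for (Method method : target.getDeclaredMethods()) {
            if (!Modifier.isPublic(method.getModifiers()) || method.isSynthetic()) {
                continue; // only audit the hand-written public API
            }
            String name = method.getName();
            String expectedPrefix =
                    "test" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            boolean found = false;
            for (Method test : testClass.getDeclaredMethods()) {
                if (test.getName().startsWith(expectedPrefix)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                missing.add(name);
            }
        }
        return new ArrayList<String>(missing);
    }

    // Hypothetical class under test.
    static class Foo {
        public int bar() { return 1; }
        public int baz() { return 2; }
        private void helper() { }
    }

    // Hypothetical test class: bar() has tests, baz() has none.
    static class FooTest {
        public void testBar() { }
        public void testBarWithNegativeInput() { }
    }

    public static void main(String[] args) {
        System.out.println(untestedMethods(Foo.class, FooTest.class)); // prints [baz]
    }
}
```

A simple prefix match like this is crude (a test named testBarn would count as a test for bar()), but the ratio of matched to unmatched public methods is the kind of number I'd want next to the coverage %.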

But what would be even greater than two tools that don't exist? A third tool that combines the two. I would love to know that my FooTest.bar*() methods give me 100% coverage on the Foo.bar() method. Having something like that would give me very high confidence in my code coverage metrics. I'm guessing that as code quality metrics move along we'll start seeing tools like that being developed.

One issue with the moreunit tool is that to do its analysis it requires test method naming conventions, which in my eyes seem to be in continual development. Google around a little bit... lots of people argue for really long names, JUnit 3 required test method names to start with "test", http://blog.jayfields.com/2008/05/testing-value-of-test-names.html kind of believes there should be no test method names at all, and I personally like to shun the Java standard camel-cased method names and go more of the Ruby route with underscores (my co-workers don't like that at all). In any case, it's clear (at least to me) that test naming is difficult. But the idea that you could get some really valuable reporting out of standardized test method names seems like a good reason to embrace standardization (at least a little bit). I'm personally ready to start prefixing all my test method names with "test" just to get some of the ad-hoc metrics that moreunit offers. I think once more tools come out that assume testing conventions, we'll start to get even more value and flexibility out of our test code.

Update - it turns out that hacking the plugin wasn't too hard. I now have the plugin recognizing method names like "foo()" instead of "testFoo()", so I'm not going to consider changing my method names after all!
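For the curious, the matching rule I'm after looks roughly like the sketch below. This is not the actual plugin code, just a standalone illustration; TestNameMatcher and its methods are hypothetical names. It treats the "test" prefix as optional and accepts either a camel-case or an underscore boundary after the method name.

```java
public class TestNameMatcher {

    // Drops an optional "test" or "test_" prefix: "testBar" -> "bar", "test_bar" -> "bar".
    // Names like "testing" are left alone because no word boundary follows "test".
    static String stripTestPrefix(String testName) {
        if (testName.startsWith("test") && testName.length() > 4) {
            char next = testName.charAt(4);
            if (next == '_') {
                return testName.substring(5);
            }
            if (Character.isUpperCase(next)) {
                return Character.toLowerCase(next) + testName.substring(5);
            }
        }
        return testName;
    }

    // True if testName looks like a test for methodName under any of the conventions:
    // "bar", "testBar", "barWithNull", "bar_handles_null", "test_bar_handles_null".
    static boolean covers(String testName, String methodName) {
        String stripped = stripTestPrefix(testName);
        if (!stripped.startsWith(methodName)) {
            return false;
        }
        if (stripped.length() == methodName.length()) {
            return true;
        }
        // Require a word boundary so "barn_door" does not match "bar".
        char boundary = stripped.charAt(methodName.length());
        return boundary == '_' || Character.isUpperCase(boundary);
    }

    public static void main(String[] args) {
        System.out.println(covers("testBar", "bar"));
        System.out.println(covers("bar_handles_null", "bar"));
        System.out.println(covers("barn_door", "bar"));
    }
}
```

The first two calls print true and the last prints false. Requiring a boundary after the method name is a judgment call; it keeps barn_door from counting as a test for bar(), at the cost of assuming people separate words consistently.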

2 comments:

niick said...

Nice post!

Large, heavy-weight (integration) tests certainly dilute coverage metrics. In practice however, such tests do add value to a test suite. Often, they probably should be excluded from a coverage report and its metrics.

You should have a look at Clover2. It allows you to view the code coverage of a file for any specific set of unit tests. For example, have a look at:
Header.java
(Click the "Show Tests" button to toggle which tests to view the coverage for)

You can also view exactly which Classes a specific testMethod hit.
e.g. testWriteResponseFileIOExceptionHandling
shows that testWriteResponseFileIOExceptionHandling hit exactly three classes, with BufferingAppender having the most coverage attributed.

Cheers,
Nick Pellow
http://atlassian.com/clover/

 