What targeting 100% code coverage taught me
I was recently in charge of drafting a new system for a customer. I took it as an opportunity to demonstrate some of the approaches that should lead to higher-quality software than what the customer was accustomed to. Among them was my goal of staying at 100% coverage, for both lines and branches, during the entire development of the system.
I have often heard the mantra that 100% code coverage is unreachable in practice. That is, the time you spend on tests is asymptotic: the closer you get to the target, the more time you waste. Essentially, if going from 99% to 99.9% costs as much as covering the first 99%, the price becomes prohibitive, especially given that the benefit is close to zero. I have also heard (and regularly told others) that some code doesn't deserve tests, such as the getters and setters of a data object.
On the other hand, TDD tells us that 100% coverage not only makes sense, but is pretty much the only viable value. After all, if you can't even test the code you write, why would you write it in the first place?
I did use TDD, but sparingly. While some parts of the code were pure logic, and thus ideal candidates for TDD, most of the code was just gluing things together and not doing anything interesting enough to deserve TDD. Moreover, I wasn't always sure how I would design my code, and quite often found myself throwing code away and redoing it. When I wasn't confident that I would keep the source code, I would wait before adding tests.
Observations
First of all, the claim that the time one spends on tests is asymptotic is false. There are, indeed, tricky cases: cases where you have to scratch your head for several hours to find a way to test that one remaining branch. One example was a method that relied on the following property being tested: `public string CurrentUser => httpContext.User?.Identity?.Name`. In ASP.NET Core, this is what gets the name of the current user, and there are three branches here, as this one-liner is equivalent to:
```csharp
public string CurrentUser
{
    get
    {
        var user = httpContext.User;
        if (user == null)
        {
            return null;
        }

        var identity = user.Identity;
        if (identity == null)
        {
            return null;
        }

        return identity.Name;
    }
}
```
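For reference, the first two branches can be exercised with tests along these lines. This is a sketch using xUnit and `DefaultHttpContext`; the `UserAccessor` class is hypothetical, standing in for whatever class exposes the `CurrentUser` property in the real code:

```csharp
using System.Security.Claims;
using Microsoft.AspNetCore.Http;
using Xunit;

public class CurrentUserTests
{
    [Fact]
    public void ReturnsNameWhenUserIsAuthenticated()
    {
        var context = new DefaultHttpContext
        {
            // A ClaimsIdentity with a name claim makes Identity.Name non-null.
            User = new ClaimsPrincipal(new ClaimsIdentity(
                new[] { new Claim(ClaimTypes.Name, "alice") },
                authenticationType: "Test"))
        };

        Assert.Equal("alice", new UserAccessor(context).CurrentUser);
    }

    [Fact]
    public void ReturnsNullWhenIdentityIsMissing()
    {
        // An empty ClaimsPrincipal has a null Identity,
        // which exercises the second branch.
        var context = new DefaultHttpContext { User = new ClaimsPrincipal() };

        Assert.Null(new UserAccessor(context).CurrentUser);
    }
}
```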
Testing that the correct name is used is pretty much straightforward. Similarly, by creating an empty `ClaimsPrincipal`, one can test the branch where the identity is `null`. However, the very first branch, the one where the user itself is `null`, is tricky. In fact, the implementation of `DefaultHttpContext.User` automatically creates a user if none was specified. This makes it necessary to create a custom mock derived from the abstract `HttpContext` class which returns `null` when asked for a user, given that the class also requires one to set up a bunch of other members that happen to be used by the code being tested: in my case, the request and response headers, and the body of the response.
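Such a mock can be sketched roughly as follows. This is an illustration rather than the actual code from the project: the `NullUserHttpContext` name is made up, and in the real implementation the request, the response headers, and the response body would be backed by simple fakes instead of throwing:

```csharp
using System;
using System.Collections.Generic;
using System.Security.Claims;
using System.Threading;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Http.Features;

public sealed class NullUserHttpContext : HttpContext
{
    // The branch we want to exercise: no user at all,
    // something DefaultHttpContext never produces.
    public override ClaimsPrincipal User
    {
        get => null;
        set => throw new NotSupportedException();
    }

    // Members actually used by the code under test would be replaced
    // by simple fakes; everything else can simply throw.
    public override HttpRequest Request => throw new NotImplementedException();
    public override HttpResponse Response => throw new NotImplementedException();
    public override IFeatureCollection Features => throw new NotImplementedException();
    public override ConnectionInfo Connection => throw new NotImplementedException();
    public override WebSocketManager WebSockets => throw new NotImplementedException();
    public override IDictionary<object, object> Items { get; set; } = new Dictionary<object, object>();
    public override IServiceProvider RequestServices { get; set; }
    public override CancellationToken RequestAborted { get; set; }
    public override string TraceIdentifier { get; set; }
    public override ISession Session { get; set; }
    public override void Abort() { }
}
```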
While some of those cases are rather boring to handle, others help in understanding the internals of .NET. One such thing is the cancellation of an `IAsyncEnumerable`. I had a case which drove me crazy for a while, where the uncovered line was a closing curly brace:
```csharp
if (token.IsCancellationRequested)
{
    if (ShouldIndicateCancellation)
    {
        yield return -1;
    } // This line was marked as uncovered.

    yield break; // This line, however, was covered.
}
```
It appeared that the coverage was giving me a hint that there was something I didn't know, or rather didn't understand, about how enumerables behave. What happened, in fact, is that this code was called from another block, which looped through the elements of the sequence and checked whether cancellation had been requested. When it had, the code above would `yield return -1`, and the caller would simply stop the iteration, which meant that, indeed, the closing brace would never be reached. As for the `yield break`, it was covered by another test, the one where `ShouldIndicateCancellation` was set to `false`, which passed through that line precisely because the `yield return -1` did not execute.
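The interaction can be reproduced with a small sketch (the names `Values` and `Consume` are illustrative, not from the original code). The consumer breaks out of the loop as soon as it sees the sentinel, so the iterator stays suspended at `yield return -1` and is disposed rather than resumed; the statements after it, including the closing brace, never run:

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

static class CancellationDemo
{
    public static bool ShouldIndicateCancellation { get; set; } = true;

    public static async IAsyncEnumerable<int> Values(
        [EnumeratorCancellation] CancellationToken token = default)
    {
        for (var i = 0; ; i++)
        {
            if (token.IsCancellationRequested)
            {
                if (ShouldIndicateCancellation)
                {
                    yield return -1;
                } // Never resumed past the yield above when the consumer breaks.

                yield break;
            }

            yield return i;
            await Task.Yield();
        }
    }

    public static async Task Consume(CancellationTokenSource cts)
    {
        var count = 0;
        await foreach (var item in Values(cts.Token))
        {
            if (item == -1)
            {
                break; // Stops iterating: the iterator is disposed, not resumed,
                       // so the closing brace after `yield return -1` stays uncovered.
            }

            if (++count == 3)
            {
                cts.Cancel();
            }
        }
    }
}
```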
Far from being an annoyance, such edge cases increase one's understanding of .NET. In my case, it helped me down the road to avoid a bug that would otherwise have been found only in production, where cancellations would not have been logged properly. It's not just the testing itself, but the attempt to reach 100% coverage, that helped catch the bug.
The assertion that there exists some logic which is “too basic” to deserve tests is equally wrong. In fact, all data structures are covered automatically as soon as you test the logic which relies on them. Granted, it is not interesting to write dedicated tests for a `struct Coordinate` which contains two properties, `int X` and `int Y`. But if you do have such a structure in your code, it means that, somehow, somewhere, it is used. For instance, there may be code which computes the distance between two coordinates. As soon as you test this code (and you definitely should test it), the getters and setters of the two properties of the structure will be tested as well.
Conclusion
100% code coverage may be a tricky target for legacy code, especially code that was never tested, and therefore probably never designed to be testable in the first place. In that case, it could indeed be smarter to set a lower target: sometimes 95%, sometimes 90% or 80%. In some cases, even 20% would be difficult to reach.
For new code, however, it makes sense to target 100% line and branch coverage. The additional cost of covering the remaining 1% won't be anywhere near what was spent covering the first 99%, and the benefit is a better understanding of the system and of the underlying language or framework, as well as fewer bugs.
One has to be careful, however, to have on the team one or a few developers who are motivated and skillful enough to try to understand what's going on when a few lines remain uncovered. Similarly, one should strongly discourage developers from marking pieces of code as excluded from code coverage. While exclusion is a valuable tool in good hands, it can quickly become a quick fix for somebody who is not interested in understanding what's going on, and who just wants to silence the annoying error that blocks the build until the given coverage is attained.
This also means that it doesn't seem reasonable to enforce 100% code coverage during the build. Here, I'm honestly unsure what should be done. On one hand, I'm absolutely convinced that such thresholds belong in the build, and that if they are not defined, any team can quickly drift away from the target. On the other hand, blocking a build for hours or days because of an artificial (yes, artificial) problem doesn't seem like a good idea either. The right solution probably varies from team to team.