Review and refactor
SoftwareEngineering.SE is a good place for interesting questions. One of the recent questions about unit tests was particularly interesting: not for its unit testing part, but because it presented a small piece of code which may seem particularly straightforward and linear, yet could still be refactored in order to mitigate the risk of errors.
Let's try to answer the question and take this opportunity to do a deep review of the piece of code in question and correct it. The original code is:
public static string SHA256(string plainText)
{
    StringBuilder sb = new StringBuilder();
    SHA256CryptoServiceProvider provider = new SHA256CryptoServiceProvider();
    var hashedBytes = provider.ComputeHash(Encoding.UTF8.GetBytes(plainText));
    for (int i = 0; i < hashedBytes.Length; i++)
    {
        sb.Append(hashedBytes[i].ToString("x2").ToLower());
    }
    return sb.ToString();
}
Through this article, we will progressively modify this piece of code in order to test it, as well as simplify it and move the burden of doing the actual work to .NET Framework itself.
This article refers to the SVN revisions. You can get the repository dump (15 KB) in order to track the changes yourself.
If you don't know how to restore SVN dumps, you may visit a wiki about it.
What unit tests do we need?
--
Personally, I'll write only three tests, and mitigate the need for other ones by modifying the code itself. The tests I'll write ensure the method behaves as expected:
- For a null string,
- For an empty string,
- For a sample string.
A more formal approach would be to follow the NIST standard. In my opinion, for this project, it is overkill. It would be different if we were reimplementing SHA256 generation without using the one available in .NET Framework.
Note that the fact that I don't use TDD for this project affects the choice of tests. For example, if I were writing tests first, without knowing which would be the actual implementation, I would have tested different locales, Unicode strings, etc. Currently, given the implementation, I know that I don't need those tests.
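A minimal sketch of what those three tests might look like, using MSTest. The containing class name Hash and the choice of sample string are assumptions of mine, not taken from the repository; the expected values are the well-known SHA-256 digests of the empty string and of "abc".

using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class HashTests
{
    [TestMethod]
    [ExpectedException(typeof(ArgumentNullException))]
    public void ComputeSHA256OfNullString()
    {
        // The method is expected to reject null input.
        Hash.SHA256(null);
    }

    [TestMethod]
    public void ComputeSHA256OfEmptyString()
    {
        // Well-known SHA-256 digest of the empty string.
        Assert.AreEqual(
            "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
            Hash.SHA256(string.Empty));
    }

    [TestMethod]
    public void ComputeSHA256OfSampleString()
    {
        // "abc" is a well-known SHA-256 test vector.
        Assert.AreEqual(
            "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
            Hash.SHA256("abc"));
    }
}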
■ See revision 2 containing both the method and the three unit tests.
What about input checking?
--
The unit test ComputeSHA256OfNullString is not testing our method. The fact that the method never checks the input, passing it all around to .NET Framework, means that the logic covered by this test is missing from the method. It simply happens that GetBytes throws the exception we want when the input is null. This means that:
- If .NET Framework's GetBytes implementation changes, the test may fail, even if our method remains unchanged. This creates a rather ugly dependency.
- Regression testing will behave strangely if one day the method is modified in a way that it no longer calls GetBytes: it wouldn't be obvious why the change caused the method to stop throwing the expected exception.
- There is a risk that somebody who has read the unit tests but not studied the code itself will modify the method to use the input before GetBytes is called, while relying on the fact that the SHA256() method is not expected to run if the text is null.
Let's cover the null case ourselves. By checking the input for null within the method, we become independent of other methods' behavior, and make the code easier to understand and safer for future changes.
■ See revision 3 for input checking.
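A minimal sketch of what the added guard clause might look like (the rest of the method stays as before):

public static string SHA256(string plainText)
{
    // Check the input ourselves instead of relying on GetBytes to do it for us.
    if (plainText == null)
    {
        throw new ArgumentNullException("plainText");
    }

    // ... the rest of the method is unchanged.
}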
Are variable names explicit enough?
--
The original code is well formatted, following Microsoft's style conventions. On the other hand, there are improvements to make in naming.
Short names are good, but understandable names are even better. sb or SHA256 are not explicit enough. builder or ComputeSHA256Hash are slightly better (even if they are still not the best ones we can imagine).
■ See revision 4 where variable names were renamed.
■ See also: Code Complete, Second edition · Chapter 11: The Power of Variable Names
Is the code correct?
--
Now that the cosmetic changes are done, it's finally time to study the code itself.
It's pretty straightforward. There is:
- A guard clause which checks the input,
- A linear set of calls to constructors and methods of .NET Framework,
- A loop.
The first thing to do is to check the code with Code analysis. Surprise: the static analysis reveals two important problems:
- The object which implements IDisposable is not properly disposed.
- The methods which are culture-dependent are called without specifying the culture.
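The first problem can be addressed by wrapping the provider in a using statement. A minimal sketch, keeping the provider local to the method for now:

byte[] hashedBytes;
using (SHA256CryptoServiceProvider provider = new SHA256CryptoServiceProvider())
{
    // The provider is disposed as soon as the hash has been computed.
    hashedBytes = provider.ComputeHash(Encoding.UTF8.GetBytes(plainText));
}

The second problem, the implicit culture, deserves a longer discussion.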
It seems appropriate to make a note here: in cases such as this one, the original developer can either make the assumptions explicit or let the next developer try to guess what was intended. Letting others guess your intentions is always bad in programming. If the intention was to use the current culture, state it explicitly. Don't rely on the defaults.
Since the original code doesn't specify the intent, let's guess it. The method is computing a hash, converting a set of bytes to their hexadecimal representation. Seems like a good candidate for an invariant culture. And still, since none is specified, it's the default, i.e. current culture, which will be used. Too bad.
Such errors are particularly ugly. They are often impossible to catch with a compiler, and are sometimes invisible to static analysis. Moreover, unit tests don't reveal them either. A piece of code can spend years doing something different from what it was expected to do, and nobody will notice, until, some day, it fails, probably after a change which reveals nothing to regression testing.
One way to solve this issue is to specify the invariant culture. Another way is to continue thinking about the original intention of the developer and what the code actually does.
someByte.ToString("x2", CultureInfo.InvariantCulture).ToLower(CultureInfo.InvariantCulture) is not doing what the author expected it to do:
- The author probably expected to obtain a lowercase hexadecimal representation of a byte.
- The actual code is obtaining a lowercase of a lowercase hexadecimal representation of a byte.
Indeed, the text is converted to lowercase, while it already is lowercase, given the "x2" format specifier.
Conclusion: code which does something different from what was intended, yet produces the same result as a correct implementation, is one of the ugliest situations which may happen. Since compilation, testing and static analysis are often unable to catch the difference, the code is assumed correct, until the error reveals itself at the moment when nobody expects it.
To avoid such errors, don't rely purely on unit tests. As you can see, static checking and an informal code review, such as the one done in this article, helped find bugs which were difficult or impossible to detect with unit tests.
Unit testing is just one of the ways to find defects, and none of the techniques can achieve defect-detection rates above 75%.
■ See also: Code complete, Second edition, page 470 (Chapter 20: The Software-Quality Landscape)
The method ends with a for. As I previously explained on SoftwareEngineering.SE, for and foreach are two completely different beasts: foreach is closer to a while than a for.
The usage of for in the current case is problematic for two reasons:
- Semantically, it should have been a foreach, not a for. We are looping through all the elements of a sequence, so for is not appropriate.
- for is error prone. It has an additional variable, it starts at a magic value and ends when another magic value is reached. If it is modified inside the loop (for example, if the value of i is altered), bad things may happen.
Use for when appropriate (that is, at most 2% of the cases in business-oriented applications). Use foreach when iterating through every element of a sequence.
■ See revisions 5, 6 and 8 for the changes related to culture, redundant code and the migration from for to foreach.
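A sketch of what the loop might look like after those changes (variable names are illustrative, not necessarily those used in the repository):

foreach (byte hashedByte in hashedBytes)
{
    // "x2" already produces lowercase hexadecimal, so no ToLower() call is needed.
    hashBuilder.Append(hashedByte.ToString("x2", CultureInfo.InvariantCulture));
}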
Do we really need three unit tests?
--
The less code we have to test, the better. It appears that with .NET Framework 4, there is a way to change the method in order to make one of the three unit tests useless.
By using Code contracts instead of a manual check of inputs, we move parameter validation from our code to .NET (not counting the benefit in terms of static analysis).
■ See revisions 7 and 9 where the guard clause was replaced by a Code contract, allowing us to remove one of the unit tests while reducing complexity, and where some other Code contracts were added for additional reliability.
■ See also: C# In Depth by Jon Skeet, Second edition · Chapter 15 Letting your code speak more clearly with Code Contracts
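A sketch of what the precondition might look like with Code contracts, assuming .NET Framework 4 with the contracts rewriter enabled (the method body is elided here):

using System;
using System.Diagnostics.Contracts;

public static string ComputeSHA256Hash(string text)
{
    // The contracts replace the manual guard clause: the rewriter generates the
    // runtime check, and the static checker can reason about the precondition.
    Contract.Requires<ArgumentNullException>(text != null);
    Contract.Ensures(Contract.Result<string>() != null);

    // ... hash computation as before.
}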
for, foreach, nothing.
--
We previously replaced a for by a foreach. An additional step is to remove the foreach. C# has LINQ, which is a powerful way to work with sequences while using paradigms inspired by functional languages.
Given that the hash is nothing more than a sequence of bytes and the result—a string—a sequence of characters, LINQ is a good choice when it comes to transforming one sequence to another.
■ See revision 12 where LINQ reduces the complexity and removes the foreach.
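A sketch of what a LINQ-based implementation might look like (revision 12 itself may differ in details):

return new string(
    hashedBytes
        // Each byte becomes two lowercase hexadecimal characters.
        .SelectMany(hashedByte => hashedByte.ToString("x2", CultureInfo.InvariantCulture))
        .ToArray());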
Wait, aren't we slow now?
--
In revision 12, we moved from a loop to LINQ chaining, which also led us to stop using StringBuilder, known to be a much faster alternative to concatenation when it comes to appending lots of strings.
Performance is an important aspect of programming, but in practice, it is often overrated and neglected at the same time. Programmers may spend hours arguing about which of two implementations is faster, when the difference between them is at most a few microseconds, while those same programmers leave in place code which could have been optimized very easily.
I won't explain in detail how expected performance should be determined, how the actual performance should be measured, and how optimization should be done. Let's simply note that:
- Premature optimization is a terrible thing.
- Only a profiler is able to determine the bottleneck: developers trying to guess where the bottleneck is are nearly always wrong.
The actual piece of code is a perfect example. Knowing that StringBuilder has a reputation of being fast, one can assume that revision 12 deteriorated the performance of the code, and that StringBuilder should be introduced again to obtain a decent speed.
Running a profiler, it appears that, as usual, I was wrong. The bottleneck is neither the new string(...) nor the LINQ expression, but the constructor of the SHA256 crypto service provider (see Fig. 1).
Indeed, creating a crypto service provider may not be an easy task, while creating a string composed of 64 characters can be done pretty fast. The primary optimization to do is to call the crypto service provider only once.
At this point, we are hit by the fact that until now, the utility class as well as the method itself were static. Static methods are good for certain tasks, but this is not one of them, especially since the generation of SHA256 hashes can be so easily modeled in an object-oriented way.
Moving from static objects to instances makes it possible to initialize the crypto service provider only once, and then call the computation method multiple times. This simple change divides the time required to compute the hash by 2.7 (see Fig. 2).
■ See revision 14 which abandons static classes and methods, allowing an optimization which multiplies the performance by 2.7.
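A sketch of what the instance-based version might look like. The class name is illustrative; the point is that the provider is created once and reused across calls:

using System;
using System.Diagnostics.Contracts;
using System.Globalization;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public sealed class SHA256Hasher : IDisposable
{
    // Created once; constructing the provider is the expensive part.
    private readonly SHA256CryptoServiceProvider provider = new SHA256CryptoServiceProvider();

    public string ComputeHash(string text)
    {
        Contract.Requires<ArgumentNullException>(text != null);
        Contract.Ensures(Contract.Result<string>() != null);

        byte[] hashedBytes = this.provider.ComputeHash(Encoding.UTF8.GetBytes(text));
        return new string(
            hashedBytes
                .SelectMany(hashedByte => hashedByte.ToString("x2", CultureInfo.InvariantCulture))
                .ToArray());
    }

    public void Dispose()
    {
        this.provider.Dispose();
    }
}

The caller creates a single instance, calls ComputeHash as many times as needed, and disposes the instance when done, which also takes care of the IDisposable warning raised earlier.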
We can do better. With a tiny modification consisting of moving from values computed on the fly to a table, another optimization by a factor of 2 becomes possible (see Fig. 3).
■ Also see: Code complete, Second edition · Chapter 18: Table-Driven Methods
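A sketch of what the table-driven variant might look like: the 256 possible two-character representations are computed once, and the per-byte work becomes a simple array lookup. The names are illustrative, and the provider field is reused from the previous sketch:

// Precomputed once for the whole application lifetime.
private static readonly string[] HexTable =
    Enumerable.Range(0, 256)
        .Select(value => value.ToString("x2", CultureInfo.InvariantCulture))
        .ToArray();

public string ComputeHash(string text)
{
    Contract.Requires<ArgumentNullException>(text != null);

    byte[] hashedBytes = this.provider.ComputeHash(Encoding.UTF8.GetBytes(text));

    // Look up the precomputed representation instead of calling ToString for every byte.
    return string.Concat(hashedBytes.Select(hashedByte => HexTable[hashedByte]));
}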
There are probably many other optimizations we can do. Some will come at a cost of readability, others will waste more memory, others will have nearly no negative impact. The fact is that the method looks pretty fast to me now, since on my machine, it performs above my expectations. When doing profiling and optimization, know when to stop. Spending days in order to gain a few milliseconds of execution time is not worth the effort (see Fig. 4).
What do we get?
--
It's time to study the result of the refactoring. What do we get? Is it actually better than the original version?
Here is a comparison between the original and the latest versions, i.e. revision 2 and revision 18:
| Metric | Revision 2 | Revision 18 |
|---|---|---|
| Code coverage of the library (higher is better) | 100% | 100% |
| Code coverage of unit tests (higher is better) | 87.5% | 100% |
| Number of Code contracts assertions (higher is better) | 14 | 789 |
| Number of unknown Code contracts assertions (lower is better) | 1 | 0 |
| Maintainability index of the library (higher is better) | 65 | 68 |
| Maintainability index of unit tests (higher is better) | 83 | 77 |
| Cyclomatic complexity of the library (lower is better) | 2 | 8 |
| Cyclomatic complexity of unit tests (lower is better) | 4 | 5 |
| Depth of inheritance of the library (lower is better) | 1 | 1 |
| Depth of inheritance of unit tests (lower is better) | 1 | 1 |
| Class coupling of the library (lower is better) | 4 | 8 |
| Class coupling of unit tests (lower is better) | 5 | 4 |
| ILLOC of the library (lower is better) | 4 | 8 |
| ILLOC of unit tests (lower is better) | 8 | 11 |
| Code analysis violations (lower is better) | 5 | 2 |
Based on those metrics, it appears that the modification is not really worth it. It improved some metrics, but deteriorated others. What those metrics don't show is that the new code:
- Solves a few bugs,
- Is slightly easier to read,
- Makes further refactoring less risky.
Each of those three points makes it totally worth the effort.