
Utility classes are wrong

Arseni Mourzenko
Founder and lead developer, specializing in developer productivity and code quality
January 20, 2017

Everyone has, somewhere on their PC, a directory called “miscellaneous,” or simply “misc.” This is where we put orphaned stuff which hasn't found its place elsewhere, grouped with similar files in a well-named directory. For some people, the desktop directory is the actual location for miscellaneous stuff.

In a similar manner, most projects I've seen developed a tumor over time and gave it the ugly name “utility.” What happens is that the project grows organically, and the architecture and overall code organization don't necessarily keep up, either because the team doesn't have time for refactoring, or because nobody has enough knowledge of the overall project structure, or simply because no one cares. Discrepancies appear, and sooner or later, someone creates a class which doesn't fit well anywhere. So the class ends up in a namespace or package which has “utility” in its name.

The second form of the “miscellaneous”-type tumor doesn't have a single project as its scope, but either a bunch of projects or, potentially, all the projects of a company. What happens is that someone, somewhere, needs to use in one project some code which was already written for a different project within the same company. With the best of intentions, the developer, instead of duplicating code, makes a shared library or package and, to highlight the reusable aspect of the new library, calls it “utility.”

In the following two sections, I'll explain why both usages of the word “utility” are problematic by themselves and indicative of a larger issue.

“Utility” as a part of a project

When I talked about the “miscellaneous” directory earlier in this article, I said nothing about the reasons which lead to the creation of such a directory in the first place. The mutually exclusive nature of directories (a file lives in exactly one of them), coupled with a flawed tree structure, is the major reason we end up with the mess. Given the rigidity of file system tree structures and the fact that an established structure doesn't evolve easily, it is not surprising that many files cannot find their place in the hierarchy.

Source code likewise takes the form of a tree structure with a limited depth and typed levels. Libraries contain packages. Packages contain classes. Classes contain methods. As with the file structure, some classes and methods will fit perfectly well within the existing structure, while others will tend to become outcasts.

When the classes or methods which don't find their place in the current code structure are labeled as “utility,” this simply means that the team was unable to adapt the structure to the newcomer. As I said at the beginning of this article, there may be multiple reasons for that, but regardless of the reason, the presence of such classes or methods is indicative of a legacy structure which wasn't refactored over time. This shouldn't happen in a correctly maintained project.

Another possibility is that the thing which was labeled as utility was actually put in the right place within the code hierarchy, but just incorrectly named. This, indeed, happens a lot. There are two reasons for that:

  • Finding good names is difficult. While, indeed, this is not an easy task, this is not a reason to use generic terms such as “utility” or “business.” All application code is business code. All application code is utility code. Please, don't use terms which bring no meaning.

  • There seems to be a magical meaning to the word “utility.”

    Some people seem to use it in opposition to “business,” in the sense that all code which is exclusive to the project is business code, while all code which could be found in any project would be utility code. This distinction makes me particularly nervous: the border between the two is anything but clear. For instance, would a rule used to generate an invoice based on the specific characteristics of airplane parts be characterized as a business rule because it is specific to the internal purchasing platform used by Boeing? Great, now what if Airbus has the exact same rule in a different project? The same goes for so-called utility code. Is code which sends e-mail utility code? Would we really need this code in shrink-wrapped software for biologists which processes DNA samples? Or maybe in embedded software which controls the automated trains in a subway?

    There are other perverted uses of this word. Many developers believe that utility code means static classes with functions in them. Not only is the term “utility” still meaningless in this context, but the whole idea is wrong in the context of an object-oriented language: given how those so-called utility functions are implemented, it leads to inflexible and unfriendly code.
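To make the last point concrete, here is a minimal sketch in Python of one way to read it. Every name in it (StringUtils, Slugifier, DashSlugifier, ArticleUrlBuilder) is hypothetical and invented for illustration; it contrasts a static “utility” function, which callers are hard-wired to, with a small named object which can be swapped.

```python
from dataclasses import dataclass
from typing import Protocol


# The "utility" style: a class used only as a bag of static functions.
# Every caller is tied to this exact implementation.
class StringUtils:
    @staticmethod
    def slugify(text: str) -> str:
        return "-".join(text.lower().split())


# An alternative: the behavior gets a meaningful name and an interface.
class Slugifier(Protocol):
    def slugify(self, text: str) -> str: ...


@dataclass(frozen=True)
class DashSlugifier:
    separator: str = "-"

    def slugify(self, text: str) -> str:
        return self.separator.join(text.lower().split())


@dataclass
class ArticleUrlBuilder:
    # Injected dependency: tests or other projects can pass another Slugifier.
    slugifier: Slugifier

    def url_for(self, title: str) -> str:
        return "/posts/" + self.slugifier.slugify(title)


builder = ArticleUrlBuilder(DashSlugifier())
print(builder.url_for("Utility classes are wrong"))
```

The second form costs a few more lines, but the caller no longer depends on one fixed function, and the class has a name which says what it does.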

“Utility” as a library shared across projects

Sharing code between projects prevents duplication, which is a Good Thing™ in any situation. And so, companies create utility libraries which grow, and grow, and grow over time. Their growth is usually horizontal; in other words, new components are added to a collection of old ones. Since the components are relatively independent (there is really little in common between serializing an object to JSON, logging an event, reading a file and creating a transaction over a network), the term “utility” is often used for lack of a better alternative. Personally, I prefer “Ali Baba,” but since I recently explained that cultural references have no place in our code, I'm afraid that the term won't stick well within the community.

The problem here is that as the library grows, it becomes more and more cumbersome. Instead of explaining the concept itself, I will simply give two examples from two ecosystems.

  • Imagine if all the Python code you use were in a single package. You need to parse CSV files? Yep, import utility is what you do. Need to work with dates? That would be import utility. Anyone want to work with regular expressions? That's clearly utility code.

  • Imagine if the whole .NET Framework, as well as every assembly created by Microsoft and the most popular NuGet packages, were in the same assembly. You need to print “Hello World” to a console? Here's the utility assembly, it can do that; pay no attention to all those SharePoint classes. What's this over here? It's MEF; you don't really need it to print text to the console, but it could be handy to have if you need it one day. As well as the Bootstrap-related code; handy too.

This mess is exactly what companies tend to produce when they create the “utility” project for everything which is used by two or more projects.
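The Python case can be sketched as follows. The catch-all utility module and its function names are imaginary and exist only in the comment; the rest uses the standard library's csv, re and datetime modules, each imported under its own name. The parse_orders function and its CSV layout are likewise assumptions made for the example.

```python
# What the imaginary catch-all package would look like to every caller:
#   import utility
#   utility.parse_csv(...), utility.parse_date(...), utility.match_regex(...)
# (no such module exists; the names are invented for illustration)

# What Python actually offers: one focused module per concern.
import csv
import io
import re
from datetime import date
from typing import Dict, List


def parse_orders(raw: str) -> List[Dict[str, object]]:
    """Keep rows whose id looks like 'AB123' and parse their ISO dates."""
    rows = csv.DictReader(io.StringIO(raw))
    return [
        {"id": row["id"], "placed": date.fromisoformat(row["placed"])}
        for row in rows
        if re.fullmatch(r"[A-Z]{2}\d{3}", row["id"])
    ]


sample = "id,placed\nAB123,2017-01-20\nbad,2017-01-21\n"
print(parse_orders(sample))
```

A reader of the import lines alone can tell what this file deals with; import utility would say nothing.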

The solution? Create separate packages or libraries. If two projects share code which produces EDI invoices, create an EDIFACT package for that. Name it EDIFACT. The next time you write code which generates secure passwords based on some fancy criteria, create a separate package for that and name it accordingly. Don't mix it with EDI under the “utility” umbrella, because developers who need to generate invoices may not necessarily need to generate passwords, and developers who work with passwords may never have heard of EDIFACT.
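As a rough sketch of the password half of that advice, here is what the entire content of a small, single-purpose module could look like. The module name passwords, the function name and the default criteria are assumptions made for the example; only the standard library's secrets and string modules are used.

```python
# Hypothetical content of a focused module, e.g. passwords.py.
# It knows nothing about invoices, EDI, or any other shared code.
import secrets
import string

DEFAULT_ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 16, alphabet: str = DEFAULT_ALPHABET) -> str:
    """Return a random password drawn from `alphabet` using a CSPRNG."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(len(generate_password(20)))
```

A project which needs passwords depends on this package alone; a project which produces invoices never sees it.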

The ability to run private npm, PyPI or NuGet repositories means that you don't even need to handle dependencies yourself, even for projects which are not open source. So for your own good, please don't put together things which have nothing to do with each other: keep them separate, and reuse the ones you need, when you need them.