Build complexity

Arseni Mourzenko
Founder and lead developer, specializing in developer productivity and code quality
June 23, 2020

I'm always amazed at the capacity of companies, large and small, to pick the tools which make their build as complex as possible. It's like they enjoy paying money for something which will cost them even more money. The core problem here is a misunderstanding of what a build actually is, and what its goal is.

When a developer adds a feature to a product or fixes a bug, he may want the change he just performed to be deployed, that is, to find its way to the machines which run the actual code. Those machines may be anything: corporate servers if we are dealing with intranet sites, or embedded hardware if we are talking about software controlling a missile or a navigation system of a submarine, or smartphones if the application is a popular game for Android. You get the picture, I think.

For all but the tiniest projects, an individual developer cannot deploy the changes by hand. A manual deployment would require the developer to SSH into a server, or somehow walk to every submarine which needs the upgrade, or be in control of every Android device where the game is installed. Not only is this either impossible or too cumbersome, but it is also error-prone: the procedure may have quite a few steps, involving starting and stopping services, copying files, ensuring environment variables are set, setting new ones, checking the free space on disk, etc., and with more steps, the risk of getting it wrong increases. For this reason, the deployment is automated, and the developer is expected to intervene as few times as possible, and ideally once: by performing the commit.

From the moment the developer commits the changes, all sorts of things start to happen. A bunch of tools start by checking whether the code matches the style rules, that is, a set of rules which dictate how the code should look (it's like Hart's rules, but for programmers). Then there are static checkers: they try to recognize patterns which in the past frequently led to bugs. Then there are tests, all sorts of tests. Those tests check for regressions. Some are extremely basic (call that piece of code, and check that its answer is this or that) and some are extremely complex, involving interactions between a few dozen machines, with statistical analysis of the results, and other cool stuff. Once a specific part of those tests is finished, the code is ready to go to production. If this is, for instance, a web application which is hosted on two hundred servers (for redundancy and scalability reasons), one would need to actually take down those two hundred servers one by one, and rebuild them with the new change. During the deployment, other tests are run, which ensures that if something goes wrong, there is time to revert the change without affecting the entire production. All those steps are necessarily automated.
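To make the one-by-one rollout a bit more concrete, here is a rough sketch in Bash. The functions deploy_to, health_check and rollback are hypothetical placeholders which only print a message; a real build would presumably call ssh, rsync, curl or similar in their place. Take it as an illustration under those assumptions, not as a real deployment script.

```shell
#!/usr/bin/env bash
# Rolling deployment sketch. deploy_to, health_check and rollback are
# hypothetical placeholders: replace their bodies with real commands.
set -eu

deploy_to()    { printf 'deploying to %s\n' "$1"; }
health_check() { printf 'checking %s\n' "$1"; }
rollback()     { printf 'rolling back %s\n' "$1"; }

# Update the given hosts one at a time. Stop at the first unhealthy host,
# so that the remaining servers keep running the previous version.
rolling_deploy() {
    for host in "$@"; do
        deploy_to "$host"
        if ! health_check "$host"; then
            rollback "$host"
            printf 'aborted at %s\n' "$host" >&2
            return 1
        fi
    done
}
```

Calling rolling_deploy web1 web2 web3 (the host names are made up) would walk through the three machines in order. If the health check fails on web2, that machine is rolled back and web3 is never touched.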

Based on this description, you may imagine that automating all that cannot be an easy task; therefore, it would make sense to use some product which does all that for us. In practice, it shouldn't happen this way. A software product never starts huge: it starts small, and if it is successful, it grows (and occasionally shrinks, but not by much). When the project is started, many companies tend to throw in a bunch of tools which, at that stage, make the build much more complicated than it needs to be. Imagine a new web application: it's brand new, and therefore small, even tiny. There are a bunch of very basic features, and because it's in its early stage, it can be hosted on a Raspberry Pi, because there are virtually no users yet. How difficult would it be to script the build for such an application? Well, pretty easy: a Bash script containing about twenty lines of code could do the job.

Now, the first solution is to do exactly that: a simple Bash script which does the whole thing. If developers do that, they know exactly what they are doing. It would moreover take a few minutes to write the actual script.
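Such a script could look roughly like the sketch below. Everything specific in it is an assumption: the three steps, the names LINT_CMD, TEST_CMD and DEPLOY_CMD, and the default of true (a command which does nothing) are placeholders, to be replaced with whatever the project actually uses.

```shell
#!/usr/bin/env bash
# Minimal build sketch. The commands are placeholders (assumptions):
# set LINT_CMD, TEST_CMD and DEPLOY_CMD to the project's real commands.
set -eu

LINT_CMD="${LINT_CMD:-true}"      # e.g. a style checker
TEST_CMD="${TEST_CMD:-true}"      # e.g. the test runner
DEPLOY_CMD="${DEPLOY_CMD:-true}"  # e.g. copying files to the server

# Run one named step and report a failure on stderr.
step() {
    local name="$1"
    shift
    printf '==> %s\n' "$name"
    if ! "$@"; then
        printf 'FAILED: %s\n' "$name" >&2
        return 1
    fi
}

# Run the steps in order; the first failure stops the build.
build() {
    step "style check" sh -c "$LINT_CMD" &&
    step "tests"       sh -c "$TEST_CMD" &&
    step "deploy"      sh -c "$DEPLOY_CMD"
}

build
```

With the defaults, the script just prints the three step names. Once the variables point to real commands, a failing test run stops the build before anything is deployed, and the whole logic stays readable in one screen of text.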

Then, there is a second solution: pick Jenkins or TeamCity and spend a few hours setting it up and configuring it properly. A few hours may feel like not a big deal if it saves months of work later on, but would it, actually?

The next time the build needs to be changed, if the build is a simple Bash script, the change can be done in a matter of minutes. With Jenkins or TeamCity, things may not be that fast. I recently wasted four days trying to figure out how to change one little thing in the build of a corporate project. The project was migrated from SVN to Git, and something didn't go well. Instead of helping me figure out what the problem was, TeamCity actively hid it under layers of abstraction, providing verbose but useless logs, and no hint as to which one of the two hundred possible settings should be changed.

Things could have been much smoother if the company had a person who is very skillful in TeamCity. Alas, there was no such person. So instead of a valuable tool, we had added complexity, added cost, and... well, no benefits whatsoever.

Usually, the proponents of those tools tell you that scripts are text-based, tools are browser-based, and everyone likes anything which runs in a browser as much as everyone hates computer terminals. Then, they would explain that a graphical interface makes it much easier to debug problems, to view reports, etc.

This is not true, at least not in general. There are cases where graphical interfaces are superior. If I need to create a 3D model of something, I would probably use a CAD tool rather than write the coordinates in a text file, and if I need to adjust a photograph, I would enjoy the power of a graphical editor, instead of typing in the terminal that the image should be sharpened by 1.2 and then cropped with the bounding box (455, 0, 1621, 920).

However, there are cases where a graphical interface is just fluff. What exactly do TeamCity or Jenkins bring that cannot be shown in a console?

  • The build logs? This is by far the worst thing it can do. Not only does it often freeze the browser, but there is no grep, no diff | view -, and no other tools that I can use to work with text. When I have to compare the logs from two consecutive builds, I have to manually download them, and only then can I work with them; why would anyone do such a thing? Instead of all the power I get when working with text, all I have is basic search; no regular expressions allowed. Flop.
  • Or maybe the list of builds? This can be shown in a console as well, and should be much faster anyway.
  • Or the list of failed tests? Here again, a console version makes more sense, and developers are more accustomed to the console version, as this is how test reports are usually shown when the tests are executed on their machines.
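As an illustration of what plain text tools give for free, here is a small sketch which lists the failed tests in a build log and compares two consecutive logs. The log layout (a leading "[HH:MM:SS] " timestamp, and "FAIL: name" lines for failed tests) is an assumed format invented for this example; it is not the format of TeamCity or Jenkins.

```shell
#!/usr/bin/env bash
# Sketch only: the log layout below is an assumption made for the example.
#   [10:00:05] FAIL: test_login
set -eu

# Drop the leading timestamp, so that two runs can be compared line by line.
strip_timestamps() {
    sed -E 's/^\[[0-9:]+\] //' "$1"
}

# Print the names of the failed tests found in a build log.
failed_tests() {
    sed -E -n 's/^\[[0-9:]+\] FAIL: //p' "$1"
}

# Show what changed between two build logs, ignoring the timestamps.
compare_builds() {
    local old new
    old="$(mktemp)"
    new="$(mktemp)"
    strip_timestamps "$1" > "$old"
    strip_timestamps "$2" > "$new"
    diff -u "$old" "$new" || true    # diff exits with 1 when the files differ
    rm -f "$old" "$new"
}
```

With two downloaded logs, compare_builds build-41.log build-42.log (hypothetical file names) prints a unified diff of everything except the timestamps, and failed_tests build-42.log prints one failed test per line, ready to be piped into sort, grep or anything else.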

There is another issue which is rarely mentioned: by taking the build configuration away from the developers and putting it in a third-party system, one loses the power of version control. In general, the build systems have their own version control, but that's the problem: it's their own, separate from the actual source. This creates a series of problems, one of which is the broken flow. Imagine I fix a bug, which involves fixing the tests, which in turn involves adjusting the build configuration. Logically, this should be a single commit. In reality, there would be one commit, and then some changes in the third-party system, tracked by this system only. Imagine now that on the next day, the business decides that the bug isn't a bug, after all. Suppose I'm on vacation, and my colleague goes and reverts my commit from the previous day. How is he expected to know that I also changed the build? Only by seeing the build broken, and spending a great deal of time searching for the reason. Such a waste of time!

There is a point in having a build tool. I believe that it can be very helpful for (1) medium or large projects (2) using standard architecture, with (3) many teams working together, in a case where (4) there is a person who knows exactly how to configure the tool. This should give a productivity boost, and could make things easier compared to home-grown scripts. However, I haven't seen such cases in practice. In practice, I see a misused tool which was picked “by default” or as if it were the only option. It adds complexity, makes information more difficult to use, and costs money through the whole life of a project, and this is simply not right.