Build complexity

Arseni Mourzenko
Founder and lead developer
June 24, 2020
Tags: rant, productivity

I'm always amazed at the capacity of companies, large and small, to pick the tools which make their build as complex as possible. It's like they enjoy paying money for something which will cost them even more money. The core problem here is a misunderstanding of what a build actually is, and what its goal is.

When a developer adds a feature to a product or fixes a bug, he may want the change he just made to be deployed, that is, to find its way to the machines which run the actual code. Those machines may be anything: corporate servers if we are dealing with intranet sites, embedded hardware if we are talking about software controlling a missile or the navigation system of a submarine, or smartphones if the application is a popular game for Android. You get the picture, I think.

For all but the tiniest projects, an individual developer cannot deploy the changes by hand. A manual deployment would require the developer to SSH into a server, or somehow walk to every submarine which needs the upgrade, or be in control of every Android device where the game is installed. Not only is this either impossible or too cumbersome, it is also error-prone: the procedure may have quite a few steps, involving starting and stopping services, copying files, ensuring environment variables are set, setting new ones, checking the free space on disk, etc., and with more steps, the risk of getting it wrong increases. For this reason, the deployment is automated, and the developer is expected to intervene as few times as possible, ideally once: by performing the commit.
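To give an idea of how little it takes to automate a couple of those error-prone steps, here is a minimal sketch of two pre-deployment checks: environment variables and free disk space. The function names, the variable names and the thresholds are all my own assumptions for illustration; a real project would pick its own.

```bash
#!/usr/bin/env bash
# Pre-deployment checks sketch. All names and thresholds are hypothetical.
set -euo pipefail

require_env() {
    # Fail if any of the named environment variables is unset or empty.
    local name
    for name in "$@"; do
        if [[ -z "${!name:-}" ]]; then
            echo "missing environment variable: $name" >&2
            return 1
        fi
    done
}

free_kb() {
    # Available space, in KiB, on the filesystem holding the given path.
    # -P forces the one-line-per-filesystem POSIX layout, -k forces KiB units.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

require_free_space() {
    # Fail if the filesystem holding $1 has less than $2 KiB available.
    local path="$1" needed_kb="$2" available
    available="$(free_kb "$path")"
    if (( available < needed_kb )); then
        echo "only ${available} KiB free on ${path}, need ${needed_kb}" >&2
        return 1
    fi
}

# Example use (hypothetical variable and path):
#   require_env APP_ENV && require_free_space /srv 102400
```

Once such checks live in a script, nobody has to remember to perform them, which is the whole point of automating the deployment.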

From the moment the developer commits the changes, all sorts of things start to happen. A bunch of tools start by checking whether the code matches the style rules, that is, a set of rules which dictate how the code should look (it's like Hart's rules, but for programmers). Then there are static checkers: they try to recognize patterns which frequently led to bugs in the past. Then there are tests, all sorts of tests. Those tests check for regressions. Some are extremely basic (call that piece of code, and check that its answer is this or that) and some are extremely complex, involving interactions between a few dozen machines, with statistical analysis of the results, and other cool stuff. Once a specific part of those tests is finished, the code is ready to go to production. If this is, for instance, a web application hosted on two hundred servers (for redundancy and scalability reasons), one would need to actually take down those two hundred servers one by one, and rebuild them with the new change. During the deployment, other tests are run, which ensures that if something goes wrong, there is time to revert the change without affecting the entire production. All those steps are necessarily automated.
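The "one by one, with checks along the way" part can be sketched in a few lines. This is only an illustration under loud assumptions: `deploy_to`, `health_check` and `roll_back` are hypothetical stand-ins which just print messages, the server names are made up, and `BROKEN_HOST` exists only to simulate a failure. A real setup would rebuild the machine and probe the running application instead.

```bash
#!/usr/bin/env bash
# Rolling deployment sketch: update servers one at a time and stop at the
# first failed check, so that a bad change never reaches the whole fleet.
set -euo pipefail

# Hypothetical stand-ins for the real deployment and rollback commands.
deploy_to() { echo "deploying to $1"; }
roll_back() { echo "rolling back $1"; }
# Pretend that the host named in BROKEN_HOST (if any) fails its checks.
health_check() { [[ "$1" != "${BROKEN_HOST:-}" ]]; }

rolling_deploy() {
    local host touched
    local -a updated=()
    for host in "$@"; do
        deploy_to "$host"
        updated+=("$host")
        if ! health_check "$host"; then
            echo "health check failed on $host, stopping" >&2
            # Revert every server touched so far; the others were never changed.
            for touched in "${updated[@]}"; do roll_back "$touched"; done
            return 1
        fi
    done
    echo "deployed to $# servers"
}

rolling_deploy web-001 web-002 web-003
```

Running it with `BROKEN_HOST=web-002` set in the environment stops after the second server and rolls back the two servers which were touched; the third one is never contacted.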

Based on this description, you may imagine that automating all that can't be an easy task; therefore, it would make sense to use some product which does all that for us. In practice, it shouldn't happen this way. A software product never starts huge: it starts small, and if it is successful, it grows (and occasionally shrinks, but not by much). When a project is started, many companies tend to throw in a bunch of tools which, at that stage, make the build much more complicated than it needs to be. Imagine a new web application: it's brand new, and therefore small, even tiny. There are a bunch of very basic features, and because it's in its early stage, it can be hosted on a Raspberry Pi, because there are virtually no users yet. How difficult would it be to script the build for such an application? Well, pretty easy: a Bash script containing about twenty lines of code could do the job.

Now, the first solution is to do exactly that: a simple Bash script which does the whole thing. If developers do that, they know exactly what they are doing. It would moreover take a few minutes to write the actual script.
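For illustration, such a script could look roughly like the sketch below. Everything specific in it is an assumption of mine: the `lint.sh` and `run-tests.sh` helpers, the `src` directory, the host name, the target path and the `app` service are placeholders, not a description of any real project. It defaults to a dry run which only prints the commands, so it can be tried safely; setting `DRY_RUN=0` would execute them.

```bash
#!/usr/bin/env bash
# Build-and-deploy sketch for a tiny web application.
# Every file name, host, path and service below is a hypothetical placeholder.
set -euo pipefail

DEPLOY_HOST="${DEPLOY_HOST:-pi@raspberrypi.local}"
DEPLOY_PATH="${DEPLOY_PATH:-/srv/app}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

run() {
    # Print the command; execute it only when the dry run is disabled.
    echo "+ $*"
    if [[ "$DRY_RUN" != "1" ]]; then "$@"; fi
}

build() {
    run ./lint.sh                      # style rules and static checks
    run ./run-tests.sh                 # regression tests
    run tar -czf app.tar.gz src        # package the application
    run scp app.tar.gz "$DEPLOY_HOST:$DEPLOY_PATH/"
    run ssh "$DEPLOY_HOST" \
        "cd $DEPLOY_PATH && tar -xzf app.tar.gz && sudo systemctl restart app"
}

build
```

Because of `set -e`, the first failing step stops the script, so a failed test never leads to a deployment.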

Then, there is a second solution: pick Jenkins or TeamCity and spend a few hours setting it up and configuring it properly. A few hours may feel like not a big deal if it saves months of work later on, but would it, actually?

The next time the build needs to be changed, if the build is a simple Bash script, the change can be done in a matter of minutes. With Jenkins or TeamCity, things may not be that fast. I recently wasted four days trying to figure out how to change one little thing in the build of a corporate project. The project was migrated from SVN to Git, and something didn't go well. Instead of helping me figure out what the problem was, TeamCity actively hid it under layers of abstraction, providing verbose but useless logs, and no hint as to which one of the two hundred possible settings should be changed.

Things could have been much smoother if the company had had a person very skilled in TeamCity. Alas, there was no such person. So instead of a valuable tool, we had added complexity, added cost, and... well, no benefits whatsoever.

Usually, the proponents of those tools tell you that scripts are text-based, tools are browser-based, and everyone likes anything which runs in a browser as much as everyone hates computer terminals. Then, they would explain that a graphical interface makes it much easier to debug problems, to view reports, etc.

This is not true. Granted, there are cases where graphical interfaces are superior. If I need to create a 3D model of something, I would probably use a CAD tool rather than write the coordinates in a text file, and if I need to adjust a photograph, I would enjoy the power of a graphical editor, instead of typing in a terminal that the image should be sharpened by 1.2 and then cropped with the bounding box (455, 0, 1621, 920).

However, there are cases where a graphical interface is just fluff. What exactly do TeamCity or Jenkins bring that cannot be shown in a console?

There is another issue which is rarely mentioned: by taking the build configuration away from the developers and putting it in a third-party system, one loses the power of version control. In general, build systems have their own version control, but that's the problem: it's their own, separate from the actual source. This creates a series of problems, one of which is the broken flow. Imagine I fix a bug, which involves fixing the tests, which in turn involves adjusting the build configuration. Logically, this should be a single commit. In reality, there would be one commit, and then some changes in the third-party system, tracked by this system only. Imagine now that on the next day, the business decides that the bug isn't a bug, after all. Suppose I'm on vacation, and my colleague goes and reverts my commit from the previous day. How is he expected to know that I also changed the build? Only by seeing the build broken, and spending a great deal of time searching for the reason. Such a waste of time!
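By contrast, when the build is a script stored next to the source, the whole scenario is handled by Git itself. The sketch below replays it in a throwaway repository; the file names (`app.txt`, `build.sh`), their contents and the commit messages are invented for the demonstration, and the temporary directory is left behind for inspection.

```bash
#!/usr/bin/env bash
# Demonstration sketch: a build change committed together with the code
# is reverted together with the code. All file names are hypothetical.
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Dev"
git config user.email "dev@example.com"

# Initial state: the code and its build configuration live side by side.
echo 'v1' > app.txt
echo 'TEST_SUITES="unit"' > build.sh
git add app.txt build.sh
git commit -q -m "Initial version"

# The bug fix changes the code AND the build, in one single commit.
echo 'v2' > app.txt
echo 'TEST_SUITES="unit integration"' > build.sh
git commit -q -a -m "Fix bug, adjust the tests and the build"

# The next day, a colleague reverts the fix; the build follows automatically.
git revert --no-edit HEAD > /dev/null
cat build.sh   # prints: TEST_SUITES="unit"
```

The colleague doesn't need to know that the build was touched at all: the revert restores it along with everything else.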

There is a point in having a build tool. I believe that it can be very helpful for (1) medium or large projects (2) using standard architecture, with (3) many teams working together, in a case where (4) there is a person who knows exactly how to configure the tool. This should give a productivity boost, and could make things easier compared to home-grown scripts. I haven't seen such cases in practice, though. In practice, I see a misused tool which was picked “by default” or as if it were the only option. It adds complexity, makes information more difficult to use, and costs money through the whole life of a project, and this is simply not right.