Linters rock, but they are slow

Arseni Mourzenko
Founder and lead developer
March 27, 2018
Tags: code-style, quality, productivity

I love linters. I love them so much that I'm actually considering using some of them at the pre-commit stage to reject commits which contain errors. The only reason I hesitate is that some of them are quite slow.

Speed matters

Currently, my server-side pre-commit process consists of a few basic checks on the log message and on code style.

For example, the log message should end with a period; enforcing this has helped me in situations where I pressed Enter by mistake after entering only part of the message, such as svn ci -m "Implemented syntax highlighting for live previews using ". Since I don't remember what I was using and need to check the exact name, there is a risk that, when coming back to the terminal, I commit as-is instead of continuing to write the message.

Another example is that the log message cannot be the same as one of the five previous messages, so if revision 2044 is “Made it possible to measure the length of the stories.” and at revision 2047 I run svn ci -m "Made it possible to measure the length of the stories.", the pre-commit hook will reject it. Same logic: if the message is the same, I've probably just gone up the history in the terminal and pressed Enter inadvertently. I do it all the time, by the way, given that I often use a separate terminal window for SVN commits and something else (such as running the Node.js application).
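The duplicate-message rule can be sketched in Python as follows. This is only an illustration under assumptions, not the actual hook: the function name `is_recent_duplicate` is hypothetical, and the previous messages are assumed to have been fetched already (for instance with svnlook) and ordered from oldest to newest.

```python
from typing import Sequence

HISTORY_DEPTH = 5  # "one of the five previous messages"


def is_recent_duplicate(message: str,
                        previous: Sequence[str],
                        depth: int = HISTORY_DEPTH) -> bool:
    """Tell whether `message` repeats one of the last `depth` log messages.

    `previous` holds earlier log messages, oldest first. `depth` is
    expected to be a positive number.
    """
    recent = previous[-depth:]
    # Surrounding whitespace is ignored (an assumption), so that a stray
    # trailing space does not let a repeated message through.
    return message.strip() in {m.strip() for m in recent}
```

With the messages of revisions 2044 to 2046 as `previous`, committing the message of revision 2044 again at revision 2047 would be flagged, while the same message repeated more than five revisions later would not.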

Some have noted that it's strange not to let the commit log message end with an exclamation point or an ellipsis. This is because I don't want such messages in my repository. Exclamation points are a sign that the message is not neutral, descriptive and formal; if the person doing the commit is too excited, he shouldn't do the commit in the first place. An ellipsis is another indication that something is wrong with the message. A message should be complete and self-sufficient. It should tell everything the reader needs to know. In this context, an ellipsis doesn't make sense.
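The ending rules can be sketched in a few lines of Python. This is a minimal illustration, not the actual hook: the function name `check_log_ending` and the error texts are assumptions. One detail is worth noting: a message ending with three dots also ends with a dot, so the ellipsis has to be tested before the period.

```python
from typing import Optional


def check_log_ending(message: str) -> Optional[str]:
    """Return an error text if the ending is not acceptable, else None."""
    text = message.rstrip()
    # Test the ellipsis first: "..." also ends with a dot and would
    # otherwise pass the final check.
    if text.endswith("...") or text.endswith("…"):
        return "The log message must not end with an ellipsis."
    if text.endswith("!"):
        return "The log message must not end with an exclamation point."
    if not text.endswith("."):
        return "The log message must end with a period."
    return None
```

In a real Subversion pre-commit hook, a non-empty result would be written to standard error and the script would exit with a non-zero code, which makes the server reject the commit.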

Prototypes are given special treatment here: the checker skips directories named “[P|p]rototype”.
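Skipping prototype directories could look like the sketch below. It assumes that the changed paths are file paths with forward slashes, as Subversion reports them; the names `is_in_prototype` and `paths_to_check` are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Both spellings accepted by the rule: "Prototype" and "prototype".
PROTOTYPE_NAMES = {"Prototype", "prototype"}


def is_in_prototype(path: str) -> bool:
    """Tell whether a file lives somewhere under a prototype directory."""
    # Only the directory components are inspected, so a file that merely
    # happens to be called "prototype.js" is still checked.
    directories = PurePosixPath(path).parent.parts
    return any(name in PROTOTYPE_NAMES for name in directories)


def paths_to_check(changed_paths: Iterable[str]) -> List[str]:
    """Keep only the changed files the checker should look at."""
    return [p for p in changed_paths if not is_in_prototype(p)]
```

Matching whole directory names rather than substrings keeps a directory such as "prototypes-archive" under the normal rules, which is one possible reading of the rule above.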

That's everything that happens here. Replication to two other SVN servers and the backup strategy are out of scope, because they happen after the commit, not before.

One of the goals of keeping it basic is to ensure commits are extremely fast. Speed is important: if commits are slow, developers will be encouraged to skip opportunities to commit and do fat commits instead. What may not be obvious is that by extremely fast, I don't just mean that a commit which takes five seconds on average is slow; what I mean is that a 900 ms commit is already too slow. It is slow because the person has to wait, i.e. remain passive. When the developer writes code, does a review, adds constructive comments, refactors, checks the changes made since the last commit or writes a descriptive log message, he spends much more than 900 ms, but he is active, so it doesn't matter. Being passive for a few hundred milliseconds, on the other hand, is really annoying, and the variability of the delay, i.e. the fact that you don't even know how long it will take, only makes things worse.
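To see which check eats the time, each one can be timed against a budget. The sketch below is an illustration under assumptions: `time_checks` and `exceeds_budget` are hypothetical names, every check is assumed to be a plain callable, and the 900 ms constant simply reuses the figure mentioned above.

```python
import time
from typing import Callable, Iterable, List, Tuple

BUDGET_MS = 900.0  # the delay described above as already too slow


def time_checks(checks: Iterable[Tuple[str, Callable[[], None]]],
                clock: Callable[[], float] = time.perf_counter
                ) -> List[Tuple[str, float]]:
    """Run every (name, check) pair and return (name, elapsed ms) pairs."""
    timings = []
    for name, check in checks:
        start = clock()
        check()
        timings.append((name, (clock() - start) * 1000.0))
    return timings


def exceeds_budget(timings: Iterable[Tuple[str, float]],
                   budget_ms: float = BUDGET_MS) -> bool:
    """Tell whether the checks together reach or pass the time budget."""
    return sum(elapsed for _, elapsed in timings) >= budget_ms
```

The clock is a parameter so that the measurement itself can be tested with fixed values; in production the default `time.perf_counter` is enough.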

But linters are nice to have

Despite the time they take, linters are nice to have at the pre-commit stage, because they provide outstanding value. I stopped counting the cases where linters saved me hours of painful debugging. This is expected, since unlike a style checker, which cares only about style, linters check for suspicious things which can result in bugs.

The logic behind setting them at pre-commit level is the same as for style: developers (including myself) are too lazy to check for things on a regular basis. If I'm not forced to write code compliant with style rules, I won't, no matter how important I think style is. I just won't check for errors regularly enough, and then, when I do check a week later, the number of errors to correct will be just too overwhelming.

In the same way, when working with C#, I don't run Code Analysis regularly enough unless it runs when the project is compiled. But then the same speed problem occurs: it is too slow (one second for a small project, much more for larger ones), so I find myself compiling less frequently. When Code Analysis doesn't run automatically, I might run it a few times a week, but then I find myself spending hours debugging an error I could have found within seconds with Code Analysis.

Of course, some linters, such as JSLint, are extremely fast. Partly, this is because they don't do anything too powerful. Google Closure Compiler, for example, goes further, but gosh, it is terribly slow.

Let the CI server call them

The only way I can see is to move linters to the same level as system and functional tests (both types often being very slow).

But this creates an additional problem, which is solved in large-scale projects but not in small companies: how to inform developers that their code contains errors?

This means that for small projects, running linters within the CI workflow doesn't really bring the benefit of nearly real-time information about potential problems with the code.

Tests, in this respect, are somewhat different. If I break a few tests, this doesn't mean the code I committed is wrong. This doesn't mean I inadvertently did something I shouldn't have. This doesn't mean anything. I may have consciously broken tests, because I know that they'll pass again this evening, or maybe I just don't care about those tests, since they deal with legacy code. Regression testing, while useful, too often leads to changes in tests, not in code. On the other hand, if the linter tells me that I'm wrong, chances are that I really am wrong.

I'm not sure if I can push the difference further, but I also have the impression that I need to know that a linter found a problem in my code much sooner than I need to know about failing tests. Two reasons:

  1. Essentially, it comes down to the importance of being notified very frequently of all style errors. This helps correct those errors unobtrusively: if I'm stuck after a hard day of work with one hundred errors, this is much more obtrusive than if I have to correct five errors every thirty minutes.

  2. Having immediate feedback on possible bugs helps reduce debugging. Unfortunately, I don't have hard data, but my very subjective impression is that many bugs found by linters are debugged immediately after the code is written, and many bugs debugged immediately after the code is written can be found by linters.

Regression testing, on the other hand, doesn't help much when I need to debug newly written code; it comes in handy when newly written code breaks existing functionality somewhere else.

This makes linters special. On the one hand, they have the same importance as style checkers and must give immediate feedback. On the other hand, they are slow, and performance-wise, are closer to system and functional tests.

Where would you put them? How would you design the notification mechanism?

*1* Of course, I'm talking about ordinary projects. Prototypes shouldn't enforce any given style, because in a prototype, we couldn't care less. Rapid development projects, i.e. projects where the priority is speed, are not an exception, by the way: for such projects, you still have to enforce style, because this helps to reduce the time wasted later when reading and maintaining the code. The difference is that prototypes are usually small, and their code will be neither read a lot nor maintained. The code of a rapid development project will not be thrown away a few weeks later.