Using LOCs to validate hypotheses

Arseni Mourzenko
Founder and lead developer
May 21, 2021
Tags: productivity, quality

In my previous article, I talked about a tool which gathers the diffs from version control commits and uses them to compute the number of lines of code (LOC) per language over time, in order to extract some cool stuff from the data.

For fun, I decided to run the tool on the code base of a client. I have been working there for the past year and a half in a team of two, so I gathered the metrics both for myself and for my colleague, Nicolas. Although I would be very skeptical about comparing two people through the LOC metric, there were still a few interesting things I discovered, and a few hypotheses I confirmed or refuted.

Work from home im­pact

Hy­poth­e­sis: since we start­ed to work from home, I com­mit­ted more code than be­fore, be­cause of the bet­ter work­ing con­di­tions; Nico­las com­mit­ted less code, be­cause he has a small child at home.

Hy­poth­e­sis re­fut­ed.

I see absolutely no difference, visually, between the period when we were working at the office and the period when we worked from home. I was, in fact, truly convinced that I would see, for me, an increase by a factor of at least 1.5 between those periods, but it is impossible to assert that there is even a small increase. I would imagine that either the logarithmic scale cannot correctly represent such differences, or, more likely, that the number of LOCs actually remained the same.
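A visual check on a logarithmic chart can be backed by a direct comparison of the two periods. The sketch below is only an illustration: the weekly totals and the cutoff date are made up (the article gives neither), and `period_ratio` is a hypothetical helper, not part of the tool. It also shows how small a factor of 1.5 looks on a base-10 logarithmic axis.

```python
import math
from datetime import date
from statistics import mean


def period_ratio(samples, cutoff):
    """Return mean(LOC on or after cutoff) / mean(LOC before cutoff)."""
    before = [loc for day, loc in samples if day < cutoff]
    after = [loc for day, loc in samples if day >= cutoff]
    if not before or not after:
        raise ValueError("need at least one sample on each side of the cutoff")
    return mean(after) / mean(before)


# Made-up weekly totals of added lines, purely illustrative
# (NOT the data behind the article's chart).
samples = [
    (date(2020, 2, 3), 1000),
    (date(2020, 2, 10), 1400),
    (date(2020, 3, 23), 1100),
    (date(2020, 3, 30), 1300),
]

# Hypothetical date of the switch to working from home.
ratio = period_ratio(samples, cutoff=date(2020, 3, 16))

# Vertical shift that a 1.5x increase produces on a log10 axis.
shift = math.log10(1.5)

print(f"after/before ratio: {ratio:.2f}")
print(f"a 1.5x increase shifts a log10 curve by {shift:.3f} decades")
```

A factor of 1.5 moves the curve by less than a fifth of a decade, so it is plausible that such a change would be hard to spot by eye; computing the ratio directly avoids that ambiguity.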

Client side, serv­er side

Hypothesis: I work mostly on the server side, Nicolas on the client side. He'll outperform me in terms of TypeScript LOCs, and I'll outperform him on the C# part.

Hy­poth­e­sis re­fut­ed.

By comparing our LOCs for specific languages, specifically C#, LESS, SQL, and TypeScript, he outperforms me in the number of SQL lines added [1], but not in other categories. He gets the expected 24% [2] for added C# lines (i.e. for every four lines of C# code I add, he adds one line), but only 61% [3] for added TypeScript.

| Category | Me | Him | Percent (him / me) |
|---|---|---|---|
| +C# | 44,705 | 10,913 [2] | 24% |
| −C# | 40,697 | 1,846 [4] | 5% |
| +LESS | 8,507 | 2,706 | 32% |
| −LESS | 7,700 | 143 | 2% |
| +SQL | 13,489 | 14,193 [1] | 105% |
| −SQL | 13,216 | 12,171 | 92% |
| +TypeScript | 18,879 | 11,506 [3] | 61% |
| −TypeScript | 17,853 | 1,881 | 11% |
| +Everything | 93,250 [6] | 39,318 | 42% |
| −Everything | 82,754 | 16,041 [5] | 19% |

Table 1. Comparing LOCs between two contributors. The "Everything" rows are not exactly the sum of the preceding rows, as they also include languages such as Bash or Python, which are not listed in the table.
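The last column of Table 1 is simply his count divided by mine. A minimal sketch that recomputes it from the raw counts follows; the names `TABLE` and `percent` are invented for the example and are not part of the tool.

```python
# (me, him) line counts copied from Table 1.
# "+" rows are added lines, "-" rows are removed lines.
TABLE = {
    "+C#": (44705, 10913),
    "-C#": (40697, 1846),
    "+LESS": (8507, 2706),
    "-LESS": (7700, 143),
    "+SQL": (13489, 14193),
    "-SQL": (13216, 12171),
    "+TypeScript": (18879, 11506),
    "-TypeScript": (17853, 1881),
    "+Everything": (93250, 39318),
    "-Everything": (82754, 16041),
}


def percent(me, him):
    """His count as a rounded percentage of mine."""
    return round(100 * him / me)


percents = {category: percent(me, him) for category, (me, him) in TABLE.items()}

for category, value in percents.items():
    print(f"{category:<12} {value:>4}%")
```

Anything above 100% means he changed more lines than I did in that category, which only happens for added SQL.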

Keep­ing code base small

Hypothesis: a quite worrisome aspect is that Nicolas doesn't refactor his code. I imagined that I would have nearly as many lines removed as added, but that he would have a very low score on removed lines.

Hy­poth­e­sis con­firmed.

This is clearly visible in Table 1, where I'm removing twenty times as much C# code as he does [4], and even more for LESS, no pun intended. Overall [5], he doesn't make enough effort cleaning things up. This could be sustainable for now, since I remove more lines than he adds (82,754 versus 39,318), but if he finds himself alone on this project, or with another colleague who also doesn't refactor enough, the code base would grow much faster than it does now.

The number of lines I remove, compared to the lines I add, represents 89%, which is not bad. More importantly, it is 91% for C#, 98% for SQL, and 95% for TypeScript. Ideally, I think that for a legacy code base, the value should be above 100% (in other words, there should be more lines removed than added), so I should try to improve that.
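The removed-to-added ratio and the resulting net growth can be derived from the same table. This is a small sketch under the assumption that the totals in Table 1 are taken at face value; `removal_ratio` and `net_growth` are illustrative names.

```python
# (added, removed) line counts copied from Table 1.
ME = {
    "C#": (44705, 40697),
    "LESS": (8507, 7700),
    "SQL": (13489, 13216),
    "TypeScript": (18879, 17853),
    "Everything": (93250, 82754),
}
HIM_TOTAL = (39318, 16041)


def removal_ratio(added, removed):
    """Removed lines as a rounded percentage of added lines."""
    return round(100 * removed / added)


def net_growth(added, removed):
    """Lines by which the code base grew (negative means it shrank)."""
    return added - removed


my_ratios = {lang: removal_ratio(a, r) for lang, (a, r) in ME.items()}
his_ratio = removal_ratio(*HIM_TOTAL)

for lang, value in my_ratios.items():
    print(f"me, {lang:<11} {value}%")
print(f"him, Everything {his_ratio}%")
print(f"net growth: me {net_growth(*ME['Everything'])}, "
      f"him {net_growth(*HIM_TOTAL)}")
```

With these numbers, my overall ratio is 89% against 41% for him, and his contributions account for more than twice my share of the net growth, even though he added less than half as many lines.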

I write most­ly C#

Hy­poth­e­sis: prob­a­bly half of the lines I add should be C# code, as I don't do a lot of client side pro­gram­ming, and I don't do much SQL ei­ther.

Hy­poth­e­sis con­firmed.

In­deed, 44,705 lines of C# code added rep­re­sent 48% of all lines of code I add. For the re­movals, it gets up to 49%—a per­fect half. I'm not sur­prised ei­ther to dis­cov­er that I don't work on LESS code, but SQL met­rics are quite sur­pris­ing: so sur­pris­ing, ac­tu­al­ly, that I sus­pect a flaw, such as some gen­er­at­ed code be­ing in­clud­ed by mis­take in the met­rics.

Pay­ing enough at­ten­tion to per­son­al pro­jects

Hy­poth­e­sis: over the pe­ri­od I was work­ing on the pro­ject of this cus­tomer, I also re­mained ac­tive on my own pro­jects, hav­ing a sim­i­lar LOC/year ra­tio for both.

Hy­poth­e­sis con­firmed.

According to Table 1, I added [6] 93,250 and removed 82,754 LOCs. Those metrics correspond to a time span going from February 2020 to May 2021. For my personal projects, if I isolate the section which goes from February 1st, 2020 to May 18th, 2021, the metrics are +86,901, −31,864. Sure, I haven't removed that many lines, but the first number seems to indicate that I'm quite active on my own projects.
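To compare the two on a LOC/year basis, the totals can be normalized by the length of the period. The sketch below assumes the same span (February 1st, 2020 to May 18th, 2021) for both, since the client's span is only given to the month, and uses 365.25 days per year; both are assumptions made for the illustration.

```python
from datetime import date

# Span given for the personal projects; reused for the client
# project as an assumption (its exact boundaries are not stated).
start, end = date(2020, 2, 1), date(2021, 5, 18)
days = (end - start).days
years = days / 365.25


def per_year(loc):
    """Normalize a LOC total to a yearly rate."""
    return loc / years


client_added, client_removed = 93250, 82754
personal_added, personal_removed = 86901, 31864

print(f"span: {days} days ({years:.2f} years)")
print(f"client:   +{per_year(client_added):,.0f} / -{per_year(client_removed):,.0f} per year")
print(f"personal: +{per_year(personal_added):,.0f} / -{per_year(personal_removed):,.0f} per year")

# Because the span is identical, the ratio of rates equals the ratio of totals.
added_ratio = personal_added / client_added
print(f"personal/client added ratio: {added_ratio:.2f}")
```

With an identical span, the added lines on personal projects come out at roughly 93% of those on the client's project, which supports the "similar ratio" claim for additions but not for removals.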

Con­clu­sion

The statistics based on LOCs are not only limited, but inherently dangerous. For every metric, one has to ask oneself: is it the best metric? Is it even the correct one? How could it be gamed? What could be the negative effect of measuring it? But LOCs are particular in that they are really easy to measure and, historically, were used to do terrible things, because they are so tempting and so easy to misuse.

Particular care should be taken when comparing two people through a metric which relies on LOCs. This is really dangerous, because it encourages you to believe things that are not true, and to make flawed decisions. It can be as basic and stupid as saying that the person who produces twice as many LOCs as a colleague should be paid twice as much, but it can also be much more subtle and convoluted.

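There are, however, valid cases where LOCs can be applied. For instance, one could use them to get a hint of how fast the code base grows, in order to react before it's too late. Or one can target a specific removed vs. added ratio in order to encourage programmers to refactor their code. In my case, I used the metric to check five things: the impact of working from home, the split between client side and server side work, how well each of us keeps the code base small, the share of C# in my own work, and how active I remained on my personal projects.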

Those five illustrations show how LOCs, enriched with additional information about the language or the author, can be used effectively to extract useful information from version control. They extend the observations I made in my previous article to multiple repositories, with multiple people working on the code base. They make it possible to confirm or refute some of my hypotheses unambiguously, and although the value of those metrics is limited, they can be used effectively in lots of cases.