How metrics impact the way we work

Arseni Mourzenko
Founder and lead developer
December 8, 2020

A long time ago, I started studying how quality and productivity can be measured. I wrote articles about it, and I convinced a few dozen entrepreneurs of the importance of measurement in their respective companies. In most cases, the difficult part was not explaining that measurement has an impact, but making them understand how to measure, and how measurements influence the system being measured.

One of the things that seems difficult to grasp is that when an employee knows that the company is measuring a given factor, he will optimize his work to perform better on this specific factor. And, naturally, the other factors could decline. This means two things. The first is that measurement itself is an active tool. You don't just probe a system. You are influencing it. The second, more subtle, is that the influence may be good or bad, whatever good and bad mean. Ideally, measurements should lead to better quality and productivity. But in practice, they don't always have this positive effect.

I'm not even talking here about measurements which are obviously bad. A classic example is the manager who pays his programmers depending on how many lines of code they write per month. That said, the fact that this metric was, and still is, quite popular in some companies indicates either the complete dumbness of some managers or, more likely, the fact that it's not that easy to see what's good and what's not so good when you are yourself part of the system.

Instead, I'm talking about metrics which seem to flow naturally from very obvious factors, such as how many copies of the software the company can sell, or how few employees the company hires. To fully grasp the impact of a given measurement, one needs more than a good understanding of the system, of the measurement itself, and of the supposed impact of the measurement on the system. It is also crucial to understand how the measurement can be performed in order to obtain results which actually mean something.

Here's a practical example of such a difficulty, recently encountered by a colleague. The original problem in his team was the constant waste of time and energy caused by merge conflicts. Every programmer within the team was working on his own branch, and on a regular basis, the branch was merged to the trunk. More often than not, this merge resulted in conflicts: given the nature of the software product and of the team, multiple programmers very often found themselves working on the same files, and restructuring the project or the team to avoid that wasn't a good idea.

When my colleague had had enough of the complaints about how a bad merge wasted two hours of work, he discovered that there were two additional problems with the way version control was used.

First, some programmers misunderstood what a commit should contain: they were committing sporadically and writing incomplete commit messages such as “WIP” or “Fixed bug.” Some of them treated commits as a tool to save their changes at the end of the day, and obviously, writing any meaningful commit message in such cases was practically impossible. One guy had written “A bunch of changes to some files” as a commit message; quite illustrative, I think. All this made it challenging for others to understand what exactly those programmers were doing.
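To make the idea of an "incomplete commit message" more concrete, here is a minimal sketch of a heuristic check. It is not something my colleague's team used: the `VAGUE_MESSAGES` list, the `MIN_WORDS` threshold, and the `is_vague` function are all assumptions made up for illustration.

```python
import re

# Hypothetical list of subjects that say nothing about the change itself.
VAGUE_MESSAGES = {"wip", "fixed bug", "fix", "changes", "update"}
# Assumed threshold: shorter subjects are treated as too terse to be useful.
MIN_WORDS = 3


def is_vague(message: str) -> bool:
    """Return True when a commit message looks too generic to be useful."""
    stripped = message.strip()
    # Only the first line (the subject) is inspected.
    subject = stripped.splitlines()[0] if stripped else ""
    # Lowercase and drop punctuation so "Fixed bug." matches "fixed bug".
    normalized = re.sub(r"[^a-z0-9 ]", "", subject.lower()).strip()
    if normalized in VAGUE_MESSAGES:
        return True
    return len(normalized.split()) < MIN_WORDS


for msg in ["WIP", "Fixed bug.", "Handle empty cart in checkout total"]:
    print(f"{msg!r}: {'vague' if is_vague(msg) else 'ok'}")
```

Note that a heuristic like this would let “A bunch of changes to some files” through, since it is long enough and not on the list. A check of this kind can only complement a human reading the history, not replace it.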

Second, some other programmers committed extremely rarely, sometimes not making any commits for a week or more. Their commit messages were clearer, since they simply referenced the feature or the bug they were working on. However, the massive number of changes in each commit made it very challenging to merge later as well.

The management created three measurements:

Three months later, it was time to see how those metrics influenced the project. It was clear that the number of commits had increased: it was now about seven times higher than before. But the number of commits is a metric, not a goal. In other words, if one developer makes twenty commits per day, and another one makes only five, it doesn't automatically mean that the first one produces more, or is a better developer, or is more productive, or more valuable to the company. As a standalone metric, it doesn't mean anything.
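Counting commits is the easy part, which may be one reason why the number is so tempting. As a rough sketch, assuming the history lives in Git (I don't claim that was the tool used here) and was exported with `git log --pretty=format:"%an|%ad" --date=short`, the per-author count takes a few lines. The `commits_per_author` function and the sample names below are hypothetical.

```python
from collections import Counter


def commits_per_author(log_lines):
    """Count commits per author from 'author|date' lines.

    The input format is an assumption: one line per commit, as produced by
    git log --pretty=format:"%an|%ad" --date=short
    """
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        author, _, _date = line.partition("|")
        counts[author.strip()] += 1
    return counts


# Made-up sample data, only to show the shape of the input.
sample = [
    "alice|2020-12-01",
    "alice|2020-12-01",
    "bob|2020-12-01",
]

for author, count in commits_per_author(sample).most_common():
    print(author, count)
```

The output ranks authors by commit count, and that is all it does. Nothing in it says who produced more value, which is exactly the limitation of the metric.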

The original goal was to reduce the pain of merge conflicts, and this is what actually mattered to the company, because the time programmers spend on merge conflicts is time they don't spend producing value. In other words, those merge conflicts represent waste which should either be eliminated or at least reduced as much as possible in order to increase productivity.

But how do you measure the time actually wasted handling merge conflicts? My colleague came up with the idea of asking the developers themselves.

Over the last three months, have you been impacted less or more by merge conflicts?

The response was rather unexpected for him. Out of twelve programmers, ten answered that they were impacted more now. One said he was impacted less. Another one couldn't answer, because he had joined the company three and a half months ago.

Such a result was especially unexpected given that my colleague himself could see that there was no longer any blaming or arguing in the office about a particularly challenging merge conflict.

When my colleague complained to me, I suggested asking a differently phrased question the next day:

How painful have the merge conflicts been over the last three months?

This time, the response was enthusiastic. All eleven programmers expressed their happiness about the merges.

What happened is that with more commits and more merges, programmers got many more merge conflicts. Previously, they had one or two merge conflicts per week (and spent perhaps hours trying to resolve them); now they were getting tiny little conflicts all the time. One programmer, for instance, complained that the other day, he got eight conflicts in a row: he was fixing some nasty bug which was difficult to understand, while another person was refactoring and committing all the time. This created a situation where the first person was getting a conflict on every merge.

The major difference, however, is that most of the time, those conflicts were solved in a matter of seconds. Look at the diff. Understand what's going on. Check the right option. Enjoy. As easy as that.
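A bit of arithmetic shows why more conflicts can still mean less pain. The numbers below are purely hypothetical, chosen only to match the orders of magnitude described above (a couple of conflicts per week costing hours, versus many conflicts costing seconds); nobody measured them.

```python
def weekly_conflict_cost_minutes(conflicts_per_week, minutes_per_conflict):
    """Total time spent on merge conflicts in a week, in minutes."""
    return conflicts_per_week * minutes_per_conflict


# Hypothetical "before": two big conflicts per week, two hours each.
before = weekly_conflict_cost_minutes(conflicts_per_week=2, minutes_per_conflict=120)

# Hypothetical "after": forty tiny conflicts per week, thirty seconds each.
after = weekly_conflict_cost_minutes(conflicts_per_week=40, minutes_per_conflict=0.5)

print(f"before: {before} minutes per week")  # before: 240 minutes per week
print(f"after: {after} minutes per week")    # after: 20.0 minutes per week
```

Under these assumed numbers, the count of conflicts goes up twenty times while the time lost goes down twelve times. A question about how often one is impacted tracks the first number; a question about pain tracks the second.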

This is why they answered the first question the way they did: indeed, they were impacted more, not less, by the conflicts now. Encountering a conflict eight times in a row is indeed something one would remember for a long time and talk about, because it's funny. However, the pain of the original, huge conflicts was gone, and so the change was deemed positive for the company.