Reviewing code through dialogue

Arseni Mourzenko
Founder and lead developer
176
articles
November 5, 2021
Tags: communication 27 code-review 2

Re­cent­ly, I tried a new tech­nique which mix­es pair pro­gram­ming and code re­view. I don't know if this tech­nique has a name, and I don't even know if it is used by some­body else—sure, I haven't seen it else­where.

My col­league and I picked a piece of code he worked on. As he's a ju­nior pro­gram­mer, un­fa­mil­iar with many of the de­tails of C# and .NET Frame­work, there were two goals in what I did. First, I want­ed to elu­ci­date the mis­con­cep­tions and mis­un­der­stand­ings he has with the tech­nolo­gies he tries to use. Sec­ond, and much more im­por­tant, I want­ed to show him what ex­act­ly should he put fo­cus on, how should he think, when he codes, what ques­tions should he ask to him­self. I mis­er­ably failed at both points; per­haps be­cause it was new for me, like­ly too be­cause he wasn't ready for this sort of things. Nev­er­the­less, I be­lieve that the tech­nique can be very ef­fec­tive in some cas­es—it would take me some time to iden­ti­fy, what cas­es ex­act­ly, and what are the lim­its.

Il­lus­tra­tion

It goes like this. We take one line of code—not nec­es­sar­i­ly the first one; it may be any­thing any­where in the new­ly writ­ten code—and then we talk about it. Say, we take this line of code:

// Counting opportunities which are active.
int counter = this.Opportunities
    .Where(opportunity => opportunity.Enabled)
    .ToList()
    .Count();

When I see this, I have some ques­tions which pop in my head. So I start with the first one, and ask my col­league:

— What do you think ToList() is do­ing?
— No idea.
— Do you know the dif­fer­ence be­tween col­lec­tions and se­quences in C#?
— I don't think so.
— Do you know about lazy eval­u­a­tion?
— I do, I use it all the time in Python, you know, the square brack­ets and the paren­the­sis...
— Ex­act­ly. So you do have the same thing in C#, but it is a bit im­plic­it. There are se­quences, which are based on an it­er­a­tor, and there are col­lec­tions, which have their el­e­ments stored in-mem­o­ry. Some op­er­a­tions in LINQ ex­pres­sions are lazy and re­turn a se­quence; oth­ers eval­u­ate the un­der­ly­ing se­quence, and re­turn a col­lec­tion. Oth­ers, fi­nal­ly, eval­u­ate a se­quence and re­turn some­thing else—a num­ber, a boolean, what­ev­er. The ToList() re­turns a col­lec­tion. Do we re­al­ly need that?
— I imag­ine that we don't.
— Don't imag­ine. If you think we don't need it, just ex­plain why.
— Be­cause it would store the el­e­ments in mem­o­ry, just in or­der to count them lat­er on.
— Ex­act­ly. As ToList() is di­rect­ly fol­lowed by Count(), there is ab­solute­ly no need to do that, and it wastes mem­o­ry, and CPU cy­cles. Count() can sim­ply it­er­ate through the el­e­ments of the se­quence, and get to the same re­sult faster, with less mem­o­ry be­ing used. Let's change our code.

// Counting opportunities which are active.
int counter = this.Opportunities.Where(opportunity => opportunity.Enabled).Count();

— Okay, so you have a Where() clause fol­lowed by a Count(). Do you see how we could com­pact this?
— I don't know.
— Do you know the over­loads of Count()?
— Hm...
— Is the term “over­load” clear?
— Yes, of course.
— What does it mean?
— It means that a func­tion can have op­tion­al ar­gu­ments.
— Not ar­gu­ments, but pa­ra­me­ters; and it's not about op­tion­al pa­ra­me­ters per se, but rather about the abil­i­ty to have mul­ti­ple meth­ods that share the same name, but dif­fer by their pa­ra­me­ters, al­low­ing the com­pil­er to fig­ure out which one of the over­loads to call at a giv­en mo­ment. So, how do you know what are the avail­able over­loads of a method in C#?
— I sup­pose I can get this from the doc­u­men­ta­tion.
— You sup­pose?
— I can get this from the doc­u­men­ta­tion.
— Please, do.
— Let's search on the in­ter­net...
— Is there an eas­i­er way?
— Hm...
— You have your Vi­su­al Stu­dio opened. Wouldn't it show you the over­loads?
— Oh, sure. Let's check. So, there is a pa­ra­me­ter­less one, and an­oth­er one which seems to ask for... a pred­i­cate?
Seems? You don't look sure. It is be­cause you don't know what a pred­i­cate is?
— Yeah, the term is a bit un­clear.
— A pred­i­cate is a con­di­tion which needs to be ful­filled. In LINQ, the term pred­i­cate is used to de­note that only the el­e­ments of a se­quence which do match a spec­i­fied con­di­tion will pass to the next step of the ex­pres­sion—or be part of the fi­nal re­sult. When you used Where() in the ex­pres­sion, you ac­tu­al­ly spec­i­fied a pred­i­cate. Now, it hap­pens that Count(), too, ac­cepts a pred­i­cate. Do you re­mem­ber what was our orig­i­nal ques­tion we were asked our­selves?
— I don't think so...
— It was whether the ex­pres­sion can be sim­pli­fied or not. Do you see a way now?
— I could put the pred­i­cate into the Count()...
— And?
— And get rid of Where()?
— Is this a ques­tion?
— And I could get rid of Where().
— Ter­rif­ic. Let's do it.

// Counting opportunities which are active.
int counter = this.Opportunities.Count(opportunity => opportunity.Enabled);

— Does opportunity look ap­peal­ing to you?
— It has a mer­it of be­ing clear.
— Sure it is. My per­son­al taste is to use short names for nar­row scopes. Es­sen­tial­ly, here, I would go with an ugly and un­clear x, with its sole ben­e­fit of be­ing re­al­ly short. But it's up to you to pick the best so­lu­tion.
— I think it's a good idea to use short names. Let's go with x.

// Counting opportunities which are active.
int counter = this.Opportunities.Count(x => x.Enabled);

— Nice. What's the type of the re­turn val­ue of Count()?
— It's the num­ber of op­por­tu­ni­ties marked as en­abled.
— My ques­tion is rather about the type...
— Right. An in­te­ger, I would imag­ine?
— How could you check whether your be­lief is cor­rect or not?
— I don't know...
— Vi­su­al Stu­dio could pro­vide some help. What hap­pens if you move your cur­sor right on Count()?
— Oh, I see, it's an in­te­ger.
— In this case, you don't need an ex­plic­it int, and can use var, that the com­pil­er would re­place by int any­way.

// Counting opportunities which are active.
var counter = this.Opportunities.Count(x => x.Enabled);

— Love­ly. Do you think the com­ment is valu­able?
— Of course.
— Please, con­tin­ue.
— Well, it tells what the code is do­ing.
— Is this the goal of a com­ment?
— What would it be?
— Well, let's think to­geth­er about it. If a com­ment re­peats what the code is do­ing, what would you learn about the code when read­ing it? Af­ter all, you could sim­ply read the ac­tu­al code.
— Why should a com­ment teach me some­thing?
— If it doesn't, what is its pur­pose?
— To help me un­der­stand the code.
To help you un­der­stand the code. In­ter­est­ing. Is this the case where the com­ment helps you to un­der­stand the code?
— Well... I'm afraid, it doesn't. Let's re­move it.
— Wait, wait, wait. First things first, let's read it again. “Count­ing op­por­tu­ni­ties which are ac­tive.” Ac­tive? Re­al­ly? It is not ex­act­ly what the code is do­ing, as it checks for the op­por­tu­ni­ties which are en­abled, not ac­tive. What's the dif­fer­ence?
— En­abled op­por­tu­ni­ties are the ones which are ac­tive in the GUI.
— So it's the term “ac­tive” which is be­ing used in the user in­ter­face?
— Yes, the busi­ness an­a­lysts asked us to use this pre­cise term, as it is more ex­plic­it to the users.
— And to us?
— What to us?
— Is it more ex­plic­it to us, de­vel­op­ers?
— I think it sure is, as “en­abled” refers to whether the op­por­tu­ni­ty is avail­able or not, where­as “ac­tive” means that it is pos­si­ble to use it at this giv­en mo­ment—at least this is how the busi­ness an­a­lysts, and I, un­der­stand it.
— Hon­est­ly, the dis­tinc­tion is un­clear for me, but if it is clear for you, and if you think that “ac­tive” is the cor­rect term here, let's re­name the prop­er­ty. And now that we did it, the com­ment can be re­moved, as it be­came tru­ly re­dun­dant.

var counter = this.Opportunities.Count(x => x.Active);

Analy­sis

As you can see, sev­er­al it­er­a­tions were need­ed in or­der to go from a piece of code which has quite a lot of is­sues, to a piece of code which is near­ly per­fect—I leave to you the job of de­cid­ing whether counter is a cor­rect name for a vari­able or not. Any­way, the ex­am­ple shows not only how the code can be im­proved, but more im­por­tant­ly how the knowl­edge and the be­liefs of a per­son can be chal­lenged, while giv­ing a learn­ing op­por­tu­ni­ty. It is not just about know­ing things: the es­sen­tial part here is to train the per­son to work in a giv­en way, to lever­age the tools he has, to know what to check, to un­der­stand the code he writes.

An or­di­nary code re­view would be short­er, but much less ef­fec­tive. Nev­er use ToList() be­fore Count(). Don't put ex­plic­it types when they match the ones which would be de­duced by the com­pil­er. Use short names in tiny scopes. Those are state­ments that do give an op­por­tu­ni­ty to learn, but not that much; more­over, they don't even try ex­plor­ing the lim­its of knowl­edge of the au­thor of the piece of code un­der re­view.

Dur­ing the hir­ing in­ter­views, I of­ten launch a dis­cus­sion, even on ques­tions where a sim­ple an­swer is re­quired. Even a nasty:

What is the val­ue of n in var n = Enumerable.Range(1, 10).Count(x => x % 2 == 0);?

Could make it pos­si­ble to talk for quite a long time about the ex­plic­it­ness of the pa­ra­me­ters (right, here, it's not about talk­ing, but more about rant­i­ng, as this is a per­fect ex­am­ple where a con­fu­sion is re­al­ly easy to do, which would make one be­lieve that the re­turn val­ue is 4, rather than 5), the dif­fer­ent operands avail­able in C#, the mag­nif­i­cent beau­ty of LINQ with its method chain­ing, the way method chain­ing is im­ple­ment­ed un­der the hood, the im­por­tance of ex­ten­sion meth­ods, the dan­gers of cus­tom ex­ten­sion meth­ods on com­mon types and in­ter­faces, such as int or IEnumerable<T>, and dozens and dozens of oth­er sub­jects which could pro­gres­sive­ly be de­rived from a sim­ple piece of code. And, ac­tu­al­ly, it's those dis­cus­sions which al­ways al­lowed me to have a clear un­der­stand­ing of the lev­el of a giv­en pro­gram­mer, rather than the ac­tu­al an­swers.

I do be­lieve that the tech­nique I de­scribed here could be used in lieu of or­di­nary train­ing cours­es, as soon as the en­vi­ron­ment al­lows it. It could be a valu­able ap­proach in men­tor­ing.

For now, one thing that I find un­for­tu­nate about this tech­nique is how de­mand­ing it is to­wards the ju­nior pro­gram­mer. It's its psy­cho­log­i­cal im­pact which should be tak­en in ac­count, as not every­one can ac­cept that the lim­its of his knowl­edge would be brought up so clear­ly. I think that it could be very de­mor­al­iz­ing for pro­gram­mers who lack self-con­fi­dence, or who sim­ply don't have a good enough re­la­tion with the more skill­ful de­vel­op­er.

An­oth­er draw­back, which is in­her­ent to my­self and the per­sons who have the ten­den­cy to work in the same way I do, is the at­ten­tion to the ex­act words, and a fur­ther analy­sis of those words. Its ef­fect can be com­pared to the ef­fect of psy­cho­analy­sis, where every­thing you say could be scru­ti­nized, and you may not ex­act­ly be will­ing to think about the ex­act words you use. This cre­ates a sort of in­com­pat­i­bil­i­ty with the per­sons who don't think too much about the words they use, as well as the per­sons who are in gen­er­al not very af­fir­ma­tive, not con­vinced about the things they say. If one is not care­ful, this could quick­ly shift the com­mu­ni­ca­tion from the ac­tu­al code to the way things are get­ting for­mu­lat­ed. While be­ing in­ter­est­ing from a lin­guis­tic per­spec­tive, this has no val­ue—and no real place when work­ing on code, and only dis­tracts both per­sons from the sub­ject. With the col­league, I have found my­self in such sit­u­a­tion dur­ing our ses­sion. The fact is, my col­league has this habit of as­sert­ing some­thing, while im­me­di­ate­ly back­ing up.

— Was Mon­day's is­sue fixed in the last re­lease?
— Sure it was, ab­solute­ly. Well, I think so. I'm not sure, maybe we for­got.

In some cir­cum­stances, this is ex­treme­ly an­noy­ing and dan­ger­ous. When you're in a mid­dle of a cri­sis be­cause the prod­uct is not work­ing any longer, and the busi­ness is los­ing mon­ey, it is es­sen­tial to be fac­tu­al. It's re­al­ly not about what some­body thinks of be­lieves. Ei­ther you as­sert some­thing you can ver­i­fy—or that you know for sure—or you don't. When it's the code which is be­ing re­viewed, com­ment­ed, and mod­i­fied, im­pre­cise lan­guage is pos­si­bly less of an is­sue, as it's not an is­sue to call a method a “func­tion,” or to mix pa­ra­me­ters and ar­gu­ments—no need to be over­ly pedan­tic.

This be­ing said, the ex­act for­mu­la­tions and words be­ing used by a ju­nior pro­gram­mer do still mat­ter, as it could lead to a dis­cov­ery of some mis­un­der­stand­ings, that would be dif­fi­cult to spot oth­er­wise. For now, it re­mains un­clear how one would do that, while avoid­ing to ap­pear pedan­tic, or just an un­lik­able per­son.