Do we think enough about coding standards we use?

Arseni Mourzenko
Founder and lead developer
177
articles
October 16, 2015
Tags: quality 36 code-style 3

What is wrong about cod­ing stan­dards and “best prac­tices” in gen­er­al is that they throw the think­ing process away. This may be a good idea for man­u­al work­ers, but is quite prob­lem­at­ic for de­vel­op­ers.

Cod­ing stan­dards, for in­stance, usu­al­ly take a form of a set of manda­to­ry rules pushed down to the de­vel­op­ers by a man­ag­er. If those cod­ing stan­dards are writ­ten by a skilled per­son, every­thing goes right. If not, the doc­u­ment does more harm than good.

Let me start by ex­plain­ing a mod­el which sorts the rules you may find in a cod­ing stan­dards doc­u­ment by ob­jec­tiv­i­ty de­gree.

Some guide­lines are some­where in the mid­dle. For in­stance, putting com­ma at the end of the last line of an ar­ray ini­tial­iza­tion in C#, like this:

var numbers = new []
{
    182,
    54,
    317,
    220, // ← Noticed the comma here?
};

makes sense in a con­text of a com­mit: if a de­vel­op­er adds an el­e­ment, the com­mit's diff shows that only one line changed. If there is no com­ma, the diff will show that two lines changed, which is not ex­act­ly rep­re­sen­ta­tive of the ac­tu­al change.

Un­for­tu­nate­ly, some cod­ing stan­dards are writ­ten with­out in-depth knowl­edge of the do­main. This leads to rules which ei­ther don't make sense or make sense only in a lim­it­ed scope. While re­view­ing a .NET cod­ing stan­dards doc­u­ment of a com­pa­ny, I found a great ex­am­ple.

The doc­u­ment en­cour­ages the use StringBuilder. In­deed, many de­vel­op­ers for­get that do­ing thou­sands of con­cate­na­tions to an al­ready long string could lead to per­for­mance prob­lems through high mem­o­ry us­age, and StringBuilder is a great so­lu­tion. Un­for­tu­nate­ly, the state­ment in the doc­u­ment doesn't give any de­tails:

Use String­Builder in­stead of us­ing "+" on the String class.

Now this is prob­lem­at­ic. With­out pro­vid­ing enough ex­pla­na­tion and with­out spec­i­fy­ing the scope of the rule, the state­ment be­comes harm­ful. By blind­ly fol­low­ing it, one would use StringBuilder for sim­ple con­cate­na­tion of short strings. This leads to sev­er­al prob­lems.

C# com­pil­er is clever enough to op­ti­mize con­cate­na­tion in some cas­es, but in sim­i­lar cas­es, it won't be able to op­ti­mize much if StringBuilder is used. For in­stance, this piece of code:

const string Name = "Jeff";
var text = "Hello, " + Name + "!";
Console.WriteLine(text);

be­comes:

IL_0000:  ldstr       "Hello, Jeff!"
IL_0005:  call        System.Console.WriteLine
IL_000A:  ret         

while:

const string Name = "Jeff";
var text = new StringBuilder()
    .Append("Hello, ")
    .Append(Name)
    .Append("!")
    .ToString();
Console.WriteLine(text);

is com­piled into:

IL_0000:  newobj      System.Text.StringBuilder..ctor
IL_0005:  ldstr       "Hello, "
IL_000A:  call        System.Text.StringBuilder.Append
IL_000F:  ldstr       "Jeff"
IL_0014:  callvirt    System.Text.StringBuilder.Append
IL_0019:  ldstr       "!"
IL_001E:  callvirt    System.Text.StringBuilder.Append
IL_0023:  callvirt    System.Object.ToString
IL_0028:  call        System.Console.WriteLine
IL_002D:  ret    

Looks ugly, right? What both­ers me more is C# code. It be­came less read­able, and this alone is a good rea­son to avoid StringBuilder un­less there is an ac­tu­al proof, ob­tained through ac­tu­al pro­fil­ing, that string con­cate­na­tion is the ac­tu­al bot­tle­neck.

I cheat­ed. In most cas­es, con­cate­na­tion wouldn't in­volve just con­stants. What if we use a pa­ra­me­ter in­stead?

var text = "Hello, " + i + "!";
Console.WriteLine(text);

gives:

IL_0000:  ldstr       "Hello, "
IL_0005:  ldarg.1     
IL_0006:  box         System.Int32
IL_000B:  ldstr       "!"
IL_0010:  call        System.String.Concat
IL_0015:  call        System.Console.WriteLine
IL_001A:  ret  

while a sim­i­lar piece of code which uses StringBuilder looks a lot like the one which was con­cate­nat­ing con­stants, the only dif­fer­ence be­ing at IL_000F:

IL_0000:  newobj      System.Text.StringBuilder..ctor
IL_0005:  ldstr       "Hello, "
IL_000A:  call        System.Text.StringBuilder.Append
IL_000F:  ldarg.1
IL_0010:  callvirt    System.Text.StringBuilder.Append
IL_0015:  ldstr       "!"
IL_001A:  callvirt    System.Text.StringBuilder.Append
IL_001F:  callvirt    System.Object.ToString
IL_0024:  call        System.Console.WriteLine
IL_0029:  ret        

Per­for­mance suf­fers too, in­stead of im­prov­ing. StringBuilder vari­ant takes two to four more CPU cy­cles com­pared to the con­cate­na­tion. This is be­cause in the ba­sic case where strings are short and there are few strings to con­cate­nate, con­cate­na­tion is faster. Un­der the hood, the con­cate­na­tion op­er­a­tor “+” is trans­formed into string.Concat. If you look at the im­ple­men­ta­tion of string.Concat, you can eas­i­ly see why. The method com­putes the to­tal length of the re­sult­ing string, and then calls C++'s FastAllocateString. Not bad. In­stead of recre­at­ing a string at every con­cate­na­tion, it just cre­ates one big string and puts the re­sults in­side by do­ing:

fixed(char *pDest = &dest.m_firstChar)
    fixed (char *pSrc = &src.m_firstChar) {
        wstrcpy(pDest + destPos, pSrc, src.Length);
    }

This means that fol­low­ing the guide­lines with­out un­der­stand­ing what hap­pens un­der the hood could lead to wrong de­ci­sions. In the ex­am­ple, it leads to code which is more dif­fi­cult to read and main­tain, which is slow­er, and which pre­vents the com­pil­er from op­ti­miz­ing it.

Ques­tion­ing styl­is­tic choic­es (the rules which are at the left in the mod­el I was talk­ing about at the be­gin­ning of the ar­ti­cle) is out of ques­tion. Those choic­es should be done at the be­gin­ning of the pro­ject, made manda­to­ry, and nev­er ever ques­tioned since then. What you should ques­tion, on the oth­er hand, is the rea­son­ing be­hind the rules which are on the oth­er side. Was the au­thor skilled enough to draft those rules? Are they still valid? Do they ap­ply to any sit­u­a­tion, or just some edge cas­es? For in­stance, Crock­ford's Javascript: The Good Parts is ex­act­ly what I ex­pect from any au­thor of a cod­ing stan­dard: an in-depth ex­pla­na­tion of why jslint's stan­dards are the way they are. Not read­ing the book and not un­der­stand­ing the un­der­ly­ing rea­son­ing makes de­vel­op­ers adopt tools such as jshint which cause bugs in their code. Un­der­stand­ing what's hap­pen­ing un­der the hood helps tak­ing the right de­ci­sions.