Did you know that Ruby is faster than C?

Arseni Mourzenko
Founder and lead developer
February 10, 2015
Tags: productivity, rant, performance

In the past few days, there were at least three questions on Programmers.SE (example) which can be summarized as: “Which programming language is faster, this one or that one?”

This is annoying. I already asserted once that it makes no sense to compare the speed of two languages. Is German faster than Japanese? It seems that the analogy is not enough, so let's discuss the subject in more detail.

Sta­tis­ti­cal data

Let's assume that the comparison between two languages can be done. For instance, every application written in Slow++ is slow as hell, and whenever the original authors rewrite the application in C++, it performs much faster.

It's ob­vi­ous that C++ is “faster” than Slow++, isn't it? Is it?

I've done sev­er­al rewrites. Every time the rewrit­ten vari­ant was faster, be­cause, hope­ful­ly, at the mo­ment of the rewrite, I'm more skill­ful than I was one year be­fore, so the new code I write is faster than the old one. The lan­guage doesn't mat­ter. My skills do.

OK. Let's imagine that apps rewritten from Slow++ to C++ become faster, while every app rewritten from C++ to Slow++ turns out to be slower. This would be a good metric to use as statistical data. The fact is, there are not many apps that were rewritten in both directions between two languages.

The very fact of a rewrite is problematic when it comes to comparing performance. Is it a simple 1:1 rewrite where the authors intentionally tried to keep the algorithm identical? Or is it a freestyle rewrite, where someone who knows one language well wrote an implementation of something that resembles another implementation, written in a different language by someone skilled in that language?

In the case of a 1:1 rewrite, the com­par­i­son makes no sense. You can't just do that and ex­pect good per­for­mance re­sults. Dif­fer­ent lan­guages have dif­fer­ent op­ti­miza­tion tech­niques you should ac­count for, so in­stead of com­par­ing two apps in dif­fer­ent lan­guages, you are ac­tu­al­ly com­par­ing a fast al­go­rithm writ­ten in one lan­guage with a non-op­ti­mized one in an­oth­er lan­guage.
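As a minimal sketch of the 1:1 problem, assume a C-style indexed loop ported line by line to Python, next to the same algorithm written with Python's built-in `sum`. The function names and the data size are made up for illustration, and timings will differ between machines and interpreters, so no numbers are claimed here.

```python
import timeit


def sum_squares_transliterated(values):
    # Line-by-line port of a C-style indexed loop.
    total = 0
    i = 0
    while i < len(values):
        total += values[i] * values[i]
        i += 1
    return total


def sum_squares_idiomatic(values):
    # Same algorithm, written with a generator expression fed to sum().
    return sum(v * v for v in values)


if __name__ == "__main__":
    data = list(range(100_000))  # arbitrary size, chosen only for the demo
    assert sum_squares_transliterated(data) == sum_squares_idiomatic(data)
    for fn in (sum_squares_transliterated, sum_squares_idiomatic):
        seconds = timeit.timeit(lambda: fn(data), number=20)
        print(f"{fn.__name__}: {seconds:.3f} s for 20 runs")
```

Both functions return the same value; whatever gap the timings show comes from how the code was written, not from a change of algorithm or of language.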

In the case of a non 1:1 rewrite, what proves that the authors put enough effort into profiling and optimizing their code in both languages? Are they skilled enough? Did they use a good profiler?

Many of my Python apps will probably be slower than their C# equivalents. This is expected, because I don't know Python well enough, and I don't even know yet how to use a profiler in Python. The performance of my Python code has nothing to do with the language: the only problem is me.

There are two oth­er as­pects which make things worse:

  1. A language doesn't exist in a vacuum. There is a framework. An OS. A compiler. For instance, using the same language with different compilers may make a tremendous difference. How would you account for that in statistical data?

  2. What is actually measured? Languages such as Java or C# use just-in-time (JIT) compilation. Languages such as JavaScript are interpreted (unless they are compiled first). Should JIT compilation or interpretation be part of the statistical data? If so, why exclude compilation itself?

Even if there is a way to get relevant statistical data, the task is very complicated, to say the least.
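The "what is actually measured?" question can be illustrated with a small Python sketch. It times the same computation twice: once inside an interpreter that is already running, and once by launching a fresh interpreter for each run. The helper names (`work`, `time_in_process`, `time_whole_process`) and the workload are assumptions made for this example only.

```python
import subprocess
import sys
import time

# The same expression is used in-process and in the child interpreter.
SNIPPET = "sum(i * i for i in range(100_000))"


def work():
    return sum(i * i for i in range(100_000))


def time_in_process(repeats=5):
    # Measures the computation only; the interpreter is already started
    # and work() is already compiled to bytecode.
    start = time.perf_counter()
    for _ in range(repeats):
        work()
    return (time.perf_counter() - start) / repeats


def time_whole_process(repeats=5):
    # Measures interpreter startup, compilation of the snippet,
    # the computation and process teardown.
    start = time.perf_counter()
    for _ in range(repeats):
        subprocess.run([sys.executable, "-c", SNIPPET], check=True)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    print(f"in-process:    {time_in_process():.4f} s per run")
    print(f"whole process: {time_whole_process():.4f} s per run")
```

Neither figure is the "right" one. Which of them belongs in a language comparison depends on what the comparison is supposed to show, and that is exactly the difficulty.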

What's the point?

The worst thing about those language comparison questions is that they have no point. For me, it's like spending hours talking about the merits of tabs versus spaces: the answer, if there is an answer, is simply irrelevant.

The ques­tion about the lan­guages can be asked in two con­texts:

  1. The per­son is hap­py to know that his pre­ferred lan­guage is the most beau­ti­ful, fast and col­or­ful lan­guage in the world.

    Of course your preferred language is the best in the world. Now go learn another one, so that you can improve your skills in your preferred language and not be mentally limited by it.

  2. The person starts a new project and needs to choose a programming language.

Choosing on that basis makes no sense. In fact, the choice of a language in general doesn't make much sense for business applications: any language will do the job. The sole motivation for picking a specific language is the skills of the team members. If all team members have spent decades programming in C++, it would be stupid to choose Ruby for the new project.

In an ideal world where developers have infinite time to create an ideal project, they would learn every existing language, implement the project in all of those languages and then compare. In the real world, the deadline won't permit that, so the optimal thing to do is to:

  1. Implement an app in the preferred language, the one the team knows very well,
  2. Find the bot­tle­necks through pro­fil­ing,
  3. Solve the bot­tle­necks with op­ti­miza­tion tech­niques.
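
The three steps above can be sketched in Python with the standard `cProfile` and `pstats` modules. The functions `slow_lookup`, `fast_lookup` and `profile` are hypothetical names invented for this illustration, and the bottleneck (a membership test against a list) is just one plausible example of what a profiler might point at.

```python
import cProfile
import io
import pstats


def slow_lookup(items, targets):
    # Step 1: a first, straightforward implementation.
    # Membership tests against a list scan it element by element.
    return [t for t in targets if t in items]


def fast_lookup(items, targets):
    # Step 3: same result, after converting the list to a set once.
    item_set = set(items)
    return [t for t in targets if t in item_set]


def profile(fn, *args):
    """Step 2: run fn under cProfile; return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    items = list(range(5_000))
    targets = list(range(0, 10_000, 2))
    slow_result, slow_report = profile(slow_lookup, items, targets)
    fast_result, fast_report = profile(fast_lookup, items, targets)
    assert slow_result == fast_result
    print(slow_report)
    print(fast_report)
```

The language stays the same throughout; only the part the profiler points at gets changed.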

It may turn out that every optimization technique has been used but the bottleneck is still there. In that case, moving this part of the app to a low-level language, including assembly, can be a valid choice. But this practice is irrelevant when making the original choice of a language, because writing everything in assembly is not an optimal solution.

Of course, an en­tire rewrite is con­ceiv­able, and I be­lieve Google does that. Google's ap­proach, as I un­der­stand it, is very in­ter­est­ing. They use Python for new pro­jects, be­cause it al­lows them to re­lease fast. Then, if per­for­mance be­comes crit­i­cal, the ap­pli­ca­tion may be rewrit­ten in Java or C++.

This makes complete sense. Writing apps in C++ in the first place would be a bad choice: longer release cycles and greater effort are not a good thing for the first version. On the other hand, rewriting in C++ later makes it possible to base the work on an existing product: with the requirements and architecture settled, the team can focus on the actual quality of the code, which is not a bad thing for the next version of a product.