Did you know that Ruby is faster than C?
In the past few days, there were at least three questions on Programmers.SE (example) which can be summarized as: “Which programming language is faster, this one or that one?”
This is annoying. I already asserted once that it makes no sense to compare the speed of two languages. Is German faster than Japanese? It seems that the analogy is not enough, so let's discuss the subject in more detail.
Statistical data
Let's admit that the comparison between two languages can be done. For instance, every application written in Slow++ is slow as hell, and whenever the original authors rewrite an application in C++, it performs much faster.
It's obvious that C++ is “faster” than Slow++, isn't it? Is it?
I've done several rewrites. Every time, the rewritten variant was faster because, hopefully, at the moment of the rewrite I was more skillful than I had been a year before, so the new code was faster than the old one. The language doesn't matter. My skills do.
OK. Let's imagine that apps rewritten from Slow++ to C++ become faster, while every app rewritten from C++ to Slow++ becomes slower. This would be a good metric to use as statistical data. The fact is, there are not many apps that have been rewritten in both directions between two languages.
The very fact of a rewrite is problematic in terms of performance. Is it a simple 1:1 rewrite where the authors intentionally tried to keep the algorithm identical? Or is it a freestyle rewrite, where someone who knows one language well wrote an implementation that resembles another implementation written in a different language by someone skilled in that language?
In the case of a 1:1 rewrite, the comparison makes no sense. You can't just do that and expect good performance results. Different languages have different optimization techniques you should account for, so instead of comparing two apps in different languages, you are actually comparing a fast algorithm written in one language with a non-optimized one in another language.
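To make the 1:1 problem concrete, here is a minimal sketch in Python (picked only as an example language; the function names are hypothetical). The first function is what a line-by-line port of a C-style loop might look like, and the second does the same job the way Python expects it to be done. Both return the same string, and the timing part is left for you to run, since the numbers depend on the machine and the interpreter.

```python
import timeit


def join_ported(words):
    """Line-by-line port of a C-style loop: manual index, manual concatenation."""
    result = ""
    i = 0
    while i < len(words):
        if i > 0:
            result = result + ","
        result = result + words[i]
        i += 1
    return result


def join_idiomatic(words):
    """The same task written with the language's own idiom."""
    return ",".join(words)


if __name__ == "__main__":
    words = [str(n) for n in range(10_000)]
    # Both variants must agree before their speed is worth comparing.
    assert join_ported(words) == join_idiomatic(words)
    for fn in (join_ported, join_idiomatic):
        seconds = timeit.timeit(lambda: fn(words), number=100)
        print(f"{fn.__name__}: {seconds:.4f} s for 100 runs")
```

If the two timings differ, the difference says something about how the code was written, not about the language: both functions are Python.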
In the case of a non-1:1 rewrite, what proves that the authors put enough effort into profiling and optimizing their code in both languages? Are they skillful enough? Have they used a good profiler?
Many of my Python apps will probably be slower than their C# equivalents. This is expected, because I don't know Python well enough, and I don't even know yet how to use a profiler in Python. The performance of my Python code has nothing to do with the language: the only problem is me.
There are two other aspects which make things worse:
- A language doesn't exist in a vacuum. There is a framework. An OS. A compiler. For instance, using the same language with different compilers may make a tremendous difference. How would you account for that in statistical data?
- What is actually measured? Languages such as Java or C# use just-in-time compilation. Languages such as JavaScript are interpreted (unless they are compiled first). Should JIT compilation or interpretation be part of the statistical data? If yes, why exclude compilation itself?
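The "what is actually measured?" question can be sketched in a few lines. The example below is only an analogy, assuming CPython, which compiles source text to bytecode before running it; the function names and the `SOURCE` snippet are hypothetical. One function starts the clock before the compile step, the other after it, so the same program produces two different "speeds" depending on where the measurement begins.

```python
import time

# Hypothetical workload used only for illustration.
SOURCE = "total = sum(i * i for i in range(100_000))"


def time_with_compile(source):
    """Start the clock before the source is compiled to bytecode."""
    start = time.perf_counter()
    code = compile(source, "<benchmark>", "exec")
    namespace = {}
    exec(code, namespace)
    return time.perf_counter() - start, namespace["total"]


def time_without_compile(source):
    """Compile first, then time only the execution."""
    code = compile(source, "<benchmark>", "exec")
    namespace = {}
    start = time.perf_counter()
    exec(code, namespace)
    return time.perf_counter() - start, namespace["total"]


if __name__ == "__main__":
    with_compile, result_a = time_with_compile(SOURCE)
    without_compile, result_b = time_without_compile(SOURCE)
    assert result_a == result_b
    print(f"including compilation: {with_compile:.6f} s")
    print(f"excluding compilation: {without_compile:.6f} s")
```

Neither number is the right one. Which of them belongs in the statistics is exactly the decision that has to be made, and justified, before any comparison.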
Even if there is a way to get relevant statistical data, the task is very complicated at best.
What's the point?
The worst thing about these language comparison questions is that they have no point. For me, it's like spending hours talking about the merits of tabs versus spaces: the answer, if there is an answer, is simply irrelevant.
The question about the languages can be asked in two contexts:
The person is happy to know that his preferred language is the most beautiful, fast and colorful language in the world.
Of course your preferred language is the best in the world. Now go learn another one, so that you can improve your skills in your preferred language and avoid being mentally limited by it.
The person starts a new project and needs to choose a programming language.
Choosing on the basis of speed makes no sense. In fact, the choice of a language in general doesn't matter much for business applications: any language will do the job. The sole motivation for picking a specific language is the skills of the team members. If all team members have spent decades programming in C++, it would be stupid to choose Ruby for the new project.
In an ideal world where developers have infinite time to create an ideal project, they would learn every existing language, implement the project in all of those languages and then compare. In the real world, the deadline won't permit that, so the optimal thing to do is to:
- Implement an app in the preferred language—the one the team knows very well,
- Find the bottlenecks through profiling,
- Solve the bottlenecks with optimization techniques.
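The profiling step above can be sketched with Python's standard `cProfile` and `pstats` modules. This is a minimal illustration, not a recipe: the "app" (`run_app`, `slow_step`, `fast_step`) and the helper `profile_report` are hypothetical names made up for the example, and `slow_step` is deliberately wasteful so that it shows up in the report.

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Deliberately wasteful: builds a full list only to sum it.
    return sum([i * i for i in range(n)])


def fast_step(n):
    return n * 2


def run_app():
    total = 0
    for _ in range(50):
        total += slow_step(20_000)
        total += fast_step(20_000)
    return total


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report(run_app))
```

The report lists functions by the time spent in them, so the optimization effort goes where the time actually is rather than where one guesses it to be.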
It may turn out that every optimization technique has been used but the bottleneck is still there. In that case, moving this part of the app to a low-level language, even Assembler, can be a valid choice. But this practice is irrelevant when making the original choice of a language, because writing everything in Assembler is not an optimal solution.
Of course, an entire rewrite is conceivable, and I believe Google does that. Google's approach, as I understand it, is very interesting. They use Python for new projects, because it allows them to release fast. Then, if performance becomes critical, the application may be rewritten in Java or C++.
This makes complete sense. Writing apps in C++ in the first place would be a bad choice: longer release cycles and more effort are not a good thing for the first version. On the other hand, rewriting in C++ later makes it possible to base the work on an existing product: with the requirements and architecture already done, the team can focus on the actual quality of the code, which is not a bad thing for the next version of a product.