YAGNI principle in action
When I discuss YAGNI with colleagues, they often give me a concrete example where they were very happy to have thought about a given feature or aspect before it was really needed. As those discussions always follow the same logic, I believe it would be easier to write an article once and refer to it later on.
Yesterday, a colleague came to me with a very concrete example where he found that applying YAGNI wasn't a good idea. He was working on a web application. The application grew naturally: it started small, with about a dozen users who were friends of the developer, and a few years later counted about two thousand active users. This growth led to two issues my colleague had to deal with: bandwidth usage and the confidentiality of the data. When the application was small, things were easy: it was hosted on a single EC2 instance and nobody really cared how the information was stored and transmitted. But more users meant that a few extra zeros appeared on the Amazon invoice, and that people started wondering what was really happening with the data under the hood.
This led, among other things, to two purely technical measures. First, the resources had to be compressed; previously, text resources such as HTML or JSON were transmitted uncompressed, which wasted bandwidth. Second, the application had to be moved from HTTP to HTTPS.
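To give a rough idea of why compressing text resources matters, here is a minimal sketch in Python. It has nothing to do with my colleague's Rails setup: the payload is a made-up list of records, and `compression_ratio` is a hypothetical helper written for this illustration. In a real deployment, compression is usually switched on in the web server or framework configuration rather than hand-coded.

```python
import gzip
import json

# Hypothetical payload: repetitive records, as JSON APIs often return.
payload = json.dumps(
    [{"id": i, "name": f"user{i}", "active": True} for i in range(1000)]
).encode("utf-8")

# Compress the body the way a server would before sending it with
# "Content-Encoding: gzip".
compressed = gzip.compress(payload)


def compression_ratio(raw: bytes, packed: bytes) -> float:
    """Return the compressed size as a fraction of the original size."""
    return len(packed) / len(raw)


print(f"{len(payload)} bytes raw, {len(compressed)} bytes gzipped")
print(f"ratio: {compression_ratio(payload, compressed):.2f}")
```

The exact numbers depend on the data, but repetitive text such as HTML or JSON typically shrinks considerably.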
As I accompanied my colleague through the migration to HTTPS, knowing a bit more than him about, for instance, the HTTP headers that he needed to use to increase security, we started talking about how difficult it was to make those two changes, which were expected to be so simple. He complained, in fact, that he had spent three days implementing compression, and had difficulties with HTTPS as well. “If only I had thought about compression and HTTPS from the very start of the project!” he told me.
Well, the “if only” reasoning is quite the opposite of YAGNI, and it is in part the “if only” logic which made those two changes so difficult in the first place.
I wondered why it was so difficult for him to add compression. I mean, it's just a bunch of lines to add, maybe some configuration to change. So he showed me the details, and indeed, the contraption wasn't easy to tame. Analyzing the situation, we identified three difficulties:
He was using a library which sat between the web server and his Ruby on Rails application. This library didn't play well with Ruby on Rails, causing trouble when my colleague attempted to add compression.
Static content wasn't served through a CDN, or delegated to the web server, or at least made part of the application itself. Instead, a third-party library, which seems to be quite fashionable in the Ruby community, was used specifically for static content. This library claimed to implement compression by default, but for some reason it didn't on EC2 specifically (although it worked locally). It took hours of debugging to understand that the problem was an improbable set of circumstances: a typo, a version mismatch, and an incorrect configuration which was silently swallowed without producing any errors in the logs.
There were at least four ways to serve content. Each way had its own quirks and specificities.
It was now time to ask ourselves why those difficulties existed in the first place.
Third-party libraries and other fashionable things
The first two were simple. The choice to use those third-party libraries was made in order to simplify the project, by delegating the work to them. This is a great thing when the benefits of a library outweigh its cost. In this case, however, those libraries just added an extra level of complexity and required extra code. I'm not saying that those libraries are useless; what I assert, however, is that in this project, they were overkill. Both libraries were large and tried to handle a tremendous number of cases and situations. The project didn't use even 1% of those libraries, while incurring their full cost. This is a clear violation of YAGNI.
What my colleague should have done, originally, is start small. A project which would be used by a dozen people doesn't need the fancy frameworks and libraries which are fashionable at the moment. Start simple, and add dependencies only when you clearly see how they would benefit the project.
It seems a simple rule, but I notice that more and more junior programmers imagine a new project as an orchestrator of fancy libraries and technologies, rather than a tool which does a given set of tasks. In some communities, this tendency is beyond sanity. It seems today that every web application should start with Angular or React or any other fancy thing, and necessarily ship with dozens or even hundreds of third-party libraries from day zero. The result is just clumsy applications which perform poorly and require megabytes of bandwidth for the most elementary thing. And as if third-party libraries were not enough, infrastructure seems to be going the same way. A few days ago, while discussing one of my projects with another colleague, I mentioned that the project has a simple homemade queue system. As soon as he heard the word “queue,” he started telling me that I needed to set up an Apache Kafka cluster. This is like seeing someone using System.Collections.Generic.Queue<T> in the source code, and claiming that he really needs to stop doing that and start relying on RabbitMQ.
At this point of the discussion, most programmers respond: “Well, that's all nice, but I obviously wouldn't start with, say, an in-memory dictionary to cache some data, when I already know from the beginning that I need a distributed cache solution. I don't want to rewrite my home-made solution later, and neither should you.”
The fact is, in most cases, you think you need a given technology, but you don't know it for sure, and especially you don't know the details.
Here's an example. I spend a lot of my personal time designing REST services. I have a lot of them, some small, some large. In many cases, when I start working on a new service, I have no objective information about its scale, nor do I know all the features it would have. So I start small: if the service needs to store data, I don't automatically provision a bunch of machines for a PostgreSQL or a MongoDB cluster. Instead, I use plain files, often coupled with a very lazy approach where, for instance, I would load an entire collection of objects when I just need one. This is by no means a good example of performance, but I simply don't care, because a service which processes twenty queries per hour won't be my next headache even if every request loads one megabyte of data, instead of a few bytes.
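To make the “plain files, lazy approach” concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not the code of any of my actual services: the `FileStore` class, its method names and the file layout (one JSON file holding the whole collection) are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


class FileStore:
    """Keeps a whole collection in a single JSON file (hypothetical helper)."""

    def __init__(self, path):
        self.path = Path(path)

    def load_all(self) -> dict:
        # Lazy on purpose: read and parse the entire collection every time.
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def get(self, key: str):
        # Loads everything just to return one item; fine at low traffic.
        return self.load_all().get(key)

    def put(self, key: str, value) -> None:
        # Rewrite the whole file on every change.
        items = self.load_all()
        items[key] = value
        self.path.write_text(json.dumps(items), encoding="utf-8")


# Usage example in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    store = FileStore(Path(tmp) / "items.json")
    store.put("42", {"title": "hello"})
    found = store.get("42")
    missing = store.get("43")
```

This sketch deliberately ignores concurrent writers, partial writes and large collections. Those are exactly the concerns that would justify moving to a real database once the service shows it needs one.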
Later on, some services grow and need a real database. Some of you may say: “Hey, see, I told you so; now you're rewriting the data access layer, instead of doing something useful.” But you forget that it took a few minutes to write the “data access layer” in the first place. For the sake of simplicity, I'm ready to sacrifice a few minutes of my time. You also forget another important aspect: since I have run the service in production for some time, and since I now have a clear understanding of how the service is used, I know exactly how to structure the database, and what would make more sense: a relational database, or some NoSQL solution. That vision saves me hours on small projects, and could save man-years for big corporations working on large projects.
Other services stay small, and are perfectly happy storing their data in plain JSON or XML files. Obviously, this is not as sexy as Cassandra, or Elasticsearch, or whatever else is fashionable today, but it has one thing going for it: it solves a given problem in the simplest way. And while you are scratching your head about how to migrate your Elasticsearch cluster from version 6 to version 7, and calling it a chore in Git, I will be working on something which brings actual value.
You aren't gonna need this piece of code
As if dependencies weren't enough, my fellow programmer was also adding extra code to handle cases which would never exist. As I mentioned, he ended up with four different ways to serve content. One of them dealt with large JSON documents. Originally, there was a suspicion that some very specific element in the application would grow, and so its JSON representation would become quite large. I spent at least half an hour trying to figure out what exactly “large” meant. My colleague obstinately refused to give me even a rough estimate, but ended up admitting that we were not talking about gigabytes, nor hundreds of megabytes, but possibly about a few megabytes of text.
His concern, then, was that large JSON would be problematic for memory usage on the server, and would allow an attacker to cause a denial of service by simply filling all the memory on the server. So there was custom code to serialize the object to JSON in parts, and then flush those parts to the client as they were generated. The result was a 200 LOC piece of code. Having worked on chunked upload transfers for Flask, I must admit that 200 LOC doesn't sound excessive for this problem.
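For readers unfamiliar with the technique, serializing JSON in parts can be sketched as follows. This is a simplified Python illustration, not my colleague's Ruby code: the function name `stream_json_array` is hypothetical, and it only handles the easy case of a top-level array whose elements are each small enough to serialize in one go.

```python
import json
from typing import Iterable, Iterator


def stream_json_array(items: Iterable[object]) -> Iterator[str]:
    """Yield a JSON array piece by piece, one element per chunk."""
    yield "["
    first = True
    for item in items:
        if not first:
            yield ","
        # Only one element is held in memory at a time.
        yield json.dumps(item)
        first = False
    yield "]"


# A web framework would flush each chunk to the client as it is produced;
# here the chunks are simply joined to show the result is valid JSON.
chunks = list(stream_json_array({"id": i} for i in range(3)))
body = "".join(chunks)
```

A real implementation also has to deal with nested structures, errors in the middle of a response and the framework's own buffering, which is where the extra lines of code come from.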
However, the problem itself simply doesn't exist. Despite the growing popularity of the application, the largest JSON response for the past month, according to the logs, was 400 KB. Similarly, right now, the bottleneck on EC2 is not memory, which is used at 25% on average, with occasional peaks at 40%, but the CPU (the application does perform some interesting stuff which requires computing power and could easily cap the CPU at 100% for a while). Despite the theoretical possibility of a denial of service, there were no identified attempts at one, and the few attacks that were observed targeted the users' passwords. This one is pretty funny, given that the application doesn't store users' passwords.
Another part dealt with dates. The web application provided an API which was used not only by the application itself, but also by a report service. In order for this report service to work correctly, a separate JSON serialization mechanism was written to handle the idiosyncrasies of the textual representation of time zones. I'm sad to say that despite its ugliness, this part has its reason to be here, and had my colleague followed YAGNI, the result would have been the same.
A third part dealt with Unicode. There was one specific place in the application where Unicode characters were accepted, and only there: users could put Unicode characters in their names. The problem with Unicode was a mix of database quirks and my colleague's understandable lack of experience with Unicode. Now the funny part. The database quirks were solved in the next version of the database. But my colleague chose to add this partial Unicode support four months before the release of that new version. The first user with Unicode characters in his name registered… seven months later. Had my colleague waited, he could have just upgraded the database, avoided any hacks in the source code, and still welcomed the new user with a Unicode name.
The impression that you were lucky to think about a given feature or technology months before you really needed it is misleading. The fact that the feature or technology would be difficult to implement right now doesn't mean that you were right to implement something for needs which may appear in the future. It means, rather, that your project is way too complex, likely because you didn't rely on the YAGNI principle enough.
Cramming in everything which is fashionable and designing software to do what you might want it to do in the future is not a correct way to create software products. It results, as shown by repeated experience, in bloatware, in solutions that are difficult to maintain, and in products which are way more complex than they need to be.
If you can't clearly explain why you need a given thing, forget about it. Move on to a feature that you know you need right now. Use your time to refactor code. To test it. And only when you absolutely know, objectively, that you need a message queue service, a database, a caching solution, a custom way to handle dates, or a fancy library which would save you hundreds of lines of code while costing a few hours of programmers' time and a few minutes per year in maintenance later, only then do what needs to be done.