Interfaces in microservices

Arseni Mourzenko

Founder and lead developer

February 7, 2017

Observing large in-house systems which either use SOA approaches or even attempt to mimic microservices, I constantly notice a pattern which makes those systems sub-optimal and difficult to maintain.

A tiny Hello World-style application can easily have all its logic as a monolithic piece of code. It can incorporate some logic, and even be relatively well organized with a bunch of goto. As the application grows, maintenance becomes more and more difficult, so the code is split into sections, usually called functions. If the growth continues, the large number of functions becomes impossible to maintain effectively, so they become methods within classes. The number of classes grows, and it becomes necessary to organize them within namespaces, which, in turn, are put in packages. Large products contain hundreds or even thousands of packages, which require an additional level of organization, and services are one such level.

Take a private method within a class. How much documentation does it need? Since the method is hopefully short and is used by a relatively small number of people (that is, developers who work on the given class), there is no need for lots of documentation: usually, a simple comment and an explanation of the arguments and the return value are largely enough. In the same way, if a mistake is made within the interface, it doesn't matter much: it is easy to change it later, and to reflect the change in the class methods which call the concerned method. This leads us to conclude that at low compartmentalization levels, interfaces can afford to be poorly documented and poorly designed.

When it comes to a class, the size of its interface (which is the sum of all the public methods within the class) and the potential cascading impact of the change of this interface means that one has to design this interface more carefully.

But what about packages and things such as services? Here, we deal with an even larger interface, which may be used by different teams within the company, or even by actual customers if the package is distributed publicly or if the service is opened to the public, making any change very difficult to propagate. Good design and good documentation become key, given the way the interface is used.

The compartmentalization of code through methods, classes, namespaces, packages and services has a bunch of limitations, including the fact that a change within a compartment could necessitate a change of its interface. Let's explain it through an example. An e-commerce website has a namespace which handles product data. Each product has a unique identifier stored as smallint, which means that a maximum of 32,767 products can be stored in the database. The website scales, and the business needs to store hundreds of thousands of products; moreover, auto-incremented identifiers are deemed impractical, and the business decides to go for globally unique IDs, which in technical terms means using UUIDs. This type change affects not only the package which accesses the products table, but also its interface, and, as a direct consequence, the packages which rely on it. A typeless interface could prevent such a change from propagating to the callers.
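
As a minimal sketch of what such a typeless interface could look like, the identifier can be exposed to callers as an opaque token rather than as an integer or a UUID. Everything named here (ProductId, ProductCatalog, the two token generators and register_and_lookup) is hypothetical and made up for illustration; an in-memory dictionary stands in for the products table.

```python
import itertools
import uuid
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ProductId:
    """Opaque token: callers keep it and hand it back, but never inspect it."""
    token: str


class ProductCatalog:
    """Hypothetical product namespace; the ID scheme is an internal detail."""

    def __init__(self, new_token: Callable[[], str]) -> None:
        self._new_token = new_token
        self._names: Dict[ProductId, str] = {}  # stands in for the products table

    def add(self, name: str) -> ProductId:
        product_id = ProductId(self._new_token())
        self._names[product_id] = name
        return product_id

    def name_of(self, product_id: ProductId) -> str:
        return self._names[product_id]


def sequential_tokens() -> Callable[[], str]:
    """Original scheme: auto-incremented integers, as with a smallint column."""
    counter = itertools.count(1)
    return lambda: str(next(counter))


def uuid_tokens() -> Callable[[], str]:
    """New scheme: globally unique identifiers."""
    return lambda: str(uuid.uuid4())


def register_and_lookup(catalog: ProductCatalog, name: str) -> str:
    """Caller code: identical whichever ID scheme the catalog uses."""
    product_id = catalog.add(name)
    return catalog.name_of(product_id)
```

Under these assumptions, switching the catalog from sequential_tokens() to uuid_tokens() changes nothing in register_and_lookup: the caller only ever receives a ProductId and passes it back.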

This leads us to a notion of the ease of propagation of changes. If an interface is transparent, this means that changes within the interfaced code will often lead to changes within the interface itself and propagate to other components. An opaque interface, however, will be more resilient to change within the interfaced component.

What makes an interface transparent or opaque? Two factors shift the opacity of an interface, each acting from one side of it:

The abstraction in relation to the underlying code. An interface which is built based on business needs will be more stable than an interface which is built based on technical considerations. For instance, Amazon S3's interface gives no clue about the actual implementation of the service. The service could use PostgreSQL or CouchDB or Dynamo: one could know how the service is implemented only through insider knowledge, not by inspecting the interface.
The abstraction in relation to the clients. An interface which is built based on the specific needs of the caller would have a high level of volatility. Since the caller's needs change over time, being dependent on the caller would necessarily influence the interface. Taking the previous example of Amazon S3, the service works the same way when storing videos for a video sharing website, or product descriptions for a product management system. If S3 were built specifically with a given e-commerce website in mind (should I mention its name?), not only would it work poorly for other types of websites, it wouldn't even fit the needs of a different e-commerce website well; it would also change a lot whenever that e-commerce website changed. Amazon's architects did a great job of abstracting the client away from the design of the interface, which means that the service can now be used by millions of clients with a good amount of diversity between them.
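
To make the two factors concrete, here is a minimal sketch of an interface that is opaque in both directions. It is not the actual S3 API: BlobStore, InMemoryBlobStore and the two caller functions are hypothetical names chosen for illustration, and the key formats are assumptions.

```python
from typing import Dict, Protocol


class BlobStore(Protocol):
    """Interface expressed in terms of keys and bytes only.

    It says nothing about the storage engine behind it (first factor)
    and nothing about videos, products or any other caller (second factor).
    """

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryBlobStore:
    """One possible implementation; a database-backed one could replace it."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def save_product_description(store: BlobStore, sku: str, text: str) -> None:
    """Caller from a product management system."""
    store.put(f"products/{sku}/description.txt", text.encode("utf-8"))


def save_video(store: BlobStore, video_id: str, content: bytes) -> None:
    """Caller from a video sharing website."""
    store.put(f"videos/{video_id}.mp4", content)
```

Both callers depend only on put and get, so replacing InMemoryBlobStore with another implementation, or adding a third kind of caller, would leave the interface untouched.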

As you can guess, opaque interfaces are essential at the service level. However, when I observe large in-house systems (especially the ones which try to look like microservices), interfaces are rather transparent in both directions: they change when the abstracted system changes, and they change when the caller's needs evolve. A major reason is that both the caller and the callee are developed by the same company, and are used as a pair. This makes it difficult to establish a decent level of abstraction, given that both sides are within reach. The resulting volatility of the interface makes it difficult to document properly (and to keep the documentation up to date), or even to architect properly in the first place; thus, we end up with poorly done, cryptic and constantly changing interfaces.

Is it bad? Absolutely. The only goal of compartmentalization is to decrease the complexity of a system: a developer can focus on a given component, without having to know the other components. A poorly done, cryptic and volatile interface prevents this, and most developers working on those in-house systems find themselves working on not one, but multiple components at the same time. There are other negative side effects. For instance, proper versioning is missing in practically every case: since interfaces change too often, with changes propagated to the callers, it's practically impossible to keep up with the pace of change.

The solution? Design the services as if they were public (or actually make them public). For comparison, at Pelican Design & Development, services were always created with public usage in mind. While this has a lot of benefits in terms of security and other aspects, it also helps to focus on the important thing: the service interface itself. The fact that the service is public means that I won't build it for my specific use case; instead, it's built considering exclusively the business needs of the service itself, not those of one specific client. It also means that the service ends up with a well-designed interface and high-quality documentation, which helps me tremendously when I have to develop another client six months later.