Simplifying systems by adding proper abstractions

Arseni Mourzenko
Founder and lead developer
April 26, 2015
Tags: devops, short, refactoring

The original draft of the Solange project defined the notions of profile and instance. According to today's documentation (revision 1234):

A profile is a generic description of a machine, i.e. a set of configuration items and operations which define a precise type of machines. For example, a profile may define a proxy server, any proxy server. Another profile may define a DHCP server.

while:

An instance is a configuration of a specific machine. Not any machine, but a precise, IP-bound, machine on a network. For example, DHCP failover 1 and DHCP failover 2 are two different machines, and so they correspond to two instances, but they share the same profile.

The difference was so essential that the project itself had two directories: one for profiles, each with a settings.json file, an initialization script and other profile-related files, and another for instances, each with a structurally different settings.json file, a different initialization script and other instance-related files.

Every machine necessarily had exactly one instance, which declared its corresponding profile.
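The relationship between the two notions can be sketched in a few lines of Python. This is only an illustration, not Solange's actual code: the class names, the field names, the sample values and the rule that instance settings override profile settings are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Generic description of a type of machine (e.g. any DHCP server)."""
    name: str
    settings: dict = field(default_factory=dict)


@dataclass
class Instance:
    """Configuration of one specific, IP-bound machine."""
    name: str
    ip: str
    profile: Profile  # exactly one profile per instance
    settings: dict = field(default_factory=dict)

    def effective_settings(self) -> dict:
        # Assumption: instance-level values override the profile's generic ones.
        return {**self.profile.settings, **self.settings}


# Hypothetical data: two machines sharing the same profile.
dhcp = Profile("dhcp-server", {"service": "dhcpd", "lease_time": 3600})
failover1 = Instance("dhcp-failover-1", "10.0.0.11", dhcp, {"role": "primary"})
failover2 = Instance("dhcp-failover-2", "10.0.0.12", dhcp, {"role": "secondary"})
```

Both instances point at the same `Profile` object, which is the "one instance, one profile" rule: anything two nearly identical profiles have in common cannot be shared in this model.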

As happens with any rigid structure, problems started to appear, the major one being code duplication. Surprisingly, I never had any difficulty determining whether a specific file, initialization command or configuration option belonged to an instance or to a profile. I expected to run into problematic cases more than once, but I didn't. On the other hand, the fact that a machine had exactly one instance and exactly one profile was really annoying. Practically identical profiles ended up copy-pasted, which was obviously terrible for later maintenance.

So it was decided to implement a hierarchy; this is more or less how the Trust project was born. Instances would remain the same, but profiles would have a parent-child relationship, where a profile can have zero or one parent, but not two or more. This would partially solve the problem of code duplication, but could also make the system non-intuitive to work with.
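A single-parent hierarchy of this kind could look roughly like the sketch below. It is a hypothetical illustration rather than the design that was actually implemented: `HierarchicalProfile`, its fields, the sample profiles and the "most specific profile wins" merge rule are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HierarchicalProfile:
    name: str
    settings: dict = field(default_factory=dict)
    parent: Optional["HierarchicalProfile"] = None  # zero or one parent

    def resolved_settings(self) -> dict:
        # Collect this profile and all its ancestors, child first.
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        # Merge root-first so that the most specific profile wins.
        merged = {}
        for profile in reversed(chain):
            merged.update(profile.settings)
        return merged


# Hypothetical profiles: shared settings live in the parent only.
base = HierarchicalProfile("linux-server", {"ntp": "pool.ntp.org", "ssh": True})
proxy = HierarchicalProfile("proxy-server", {"service": "squid"}, parent=base)
caching_proxy = HierarchicalProfile(
    "caching-proxy", {"cache_size_mb": 4096}, parent=proxy
)
```

The duplication goes away because common settings are written once in `base`, but understanding what a machine actually receives now requires walking the whole chain, which is the non-intuitive part. The sketch does not guard against cycles in the parent links.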

In parallel came the idea of removing the distinction between profiles and instances. Every instance would simply inherit from a parent, or remain standalone and provide every needed piece of information itself. This simplified the structure a bit, but still, given my hatred of tree structures, there was no way I would implement such a structure at the core of the most important project of my life.

This is where Trust came in handy, bringing two strong points:

What's interesting about this is that by setting up proper abstractions, one can simplify a system substantially. With profiles and instances, Solange was difficult to explain to people who had never used it before, and was frankly not much simpler than Chef and Puppet: it had its own terminology and its own way of structuring data, things which are not necessarily needed, but are still there for the sake of complexity. With the abstraction brought by Trust, Solange becomes much more elegant and easier to understand. It also becomes more flexible: a small company with a few dozen virtual machines and a single system administrator won't organize its data the same way as a company with hundreds of thousands of virtual machines maintained by several departments. And if this is not enough, the change will also make Solange's source code shorter and simpler.