And what if most projects were research projects?

Arseni Mourzenko
Founder and lead developer
January 19, 2015
Tags: management, communication

When I started to work on my latest project, which required skills in Linux, Python and Node.js, three domains I had never used before, I was very clear with myself: the project will take what it will take, maybe two months, maybe six, maybe three years.

While I had the project roadmap and a goal, I had no idea where exactly it would get me. Day after day, the project was moving forward, sometimes backward as well, and day after day I was facing the risk of being blocked on some issue for weeks or months. And I was, several times, but I couldn't care less. Well, I did care, because it was sometimes exasperating, like my experience with MAAS: after losing three weeks, I was forced to abandon it, since it simply doesn't work. At all.

Without much Linux knowledge, I obviously found myself in WTF mode more often than on ordinary projects (i.e. the ones where you don't learn anything). Bad things happened, sometimes because of bugs in Linux applications, often because of my lack of knowledge. More rarely, there were true WTF moments like the one with rsyslog:

I was migrating everything to syslog in order to use a common logging server. It worked well for Apache: it took me probably less than a quarter of an hour to set the thing up. Quick estimation exercise: if it takes 15 minutes to make Apache use syslog, how much time would you need to make the same change for Nginx?

In reality, Nginx turned out to be much more reluctant. I finally discovered that I was using a version which didn't support logging to syslog yet, and that I could not possibly use a version which did.

It appeared, on the other hand, that rsyslog is able to monitor files and import data from them. Great. Or maybe not, since hours of attempts led me to a headache, and nothing more. rsyslog refused to pick up anything from the Nginx logs, while happily reading data from other files.

I checked everything: owner and group, permissions, even echoing to the end of the file (since that changes the file date, while Nginx logging doesn't). Finally, I found that changes to every file in the /var/log/nginx/ directory are silently ignored by rsyslog, while changes outside the directory (including when Nginx is logging directly to, say, /var/log/nginx-access.log) are taken into account. I don't freaking believe this!
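The checks listed above (owner, group, permissions, modification date) can be compared side by side for a file that rsyslog picks up and one that it ignores. The following is only a rough sketch of that comparison, not the procedure actually used at the time: the helper names `describe` and `differences` are made up for illustration, and the two paths are passed on the command line rather than assumed.

```python
import os
import stat
import sys
import time


def describe(path: str) -> dict:
    """Collect owner, group, permission string and modification time of a path."""
    info = os.stat(path)
    return {
        "path": path,
        "uid": info.st_uid,
        "gid": info.st_gid,
        "mode": stat.filemode(info.st_mode),
        "modified": time.strftime(
            "%Y-%m-%d %H:%M:%S", time.localtime(info.st_mtime)
        ),
    }


def differences(a: dict, b: dict) -> list:
    """Return the names of the ownership/permission attributes that differ."""
    return [key for key in ("uid", "gid", "mode") if a[key] != b[key]]


if __name__ == "__main__":
    # Hypothetical usage: python compare_logs.py WORKING_FILE IGNORED_FILE
    if len(sys.argv) == 3:
        first, second = describe(sys.argv[1]), describe(sys.argv[2])
        print(first)
        print(second)
        print("differs in:", differences(first, second) or "nothing")
    else:
        print("usage: compare_logs.py FILE_A FILE_B")
```

Since `describe` works on directories as well, the same comparison can be run on the parent directories of the two files. An empty list of differences, as in the situation described above, at least rules out the usual suspects.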

Situations like this make it obvious that it is absolutely impossible to draw up a project schedule. When a tiny task, similar to one which took fifteen minutes, takes hours of head-banging, there is no way I can promise to deliver the project in four weeks. Or eight months. Or three years.

Many pro­jects are sim­i­lar, shar­ing two es­sen­tial as­pects: (1) the roadmap is un­clear and (2) shit hap­pens.

The roadmap is un­clear

Most projects are inherently unclear. It doesn't matter what the guy who wrote the requirements says, and it doesn't matter that the customer assured you that he has a clear idea of the future product. Clear projects require a clear methodology that most IT people don't know (the next time your manager rants about his expertise and the decades he spent in the industry, ask him what the difference is between a functional and a non-functional requirement).

It's not that bad. What is bad is that management often handles a project as if the roadmap were clear. A boss may decide that the project will be released with exactly those features in exactly three months, and when, eight months later, the project is still not released, the boss is embarrassed, because he kinda promised something to someone. Or an IT firm may sign a contract stating that the customer will pay $80,000 for a specific project, and when, six months later, the features requested by the customer have nothing to do with the initial vision, lawyers start making phone calls to decide who should pay for the mess.

I'm not sure why so many people dislike Agile so much that they call themselves Agile while managing unclear projects as if they were perfectly clear.

Shit hap­pens

It doesn't even mat­ter how clear the roadmap is, since the im­ple­men­ta­tion it­self is of­ten hard to es­ti­mate.

I'm quite skillful in C# and .NET. This also means that my estimates for a one-man project have become quite precise over time. Usually, if I say that it will take me five weeks to develop a web app which has a clear roadmap, it means that in three to five weeks, it will be up and running. But even .NET has its caveats, and I know that whenever the project contains something even weakly related to WCF, I have to be very careful with my estimates, because the project can easily slip by one or two weeks because of the WCF configuration nightmare.
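As a small worked example of the arithmetic above: a three-to-five-week estimate that may slip by up to two weeks becomes a three-to-seven-week range. The sketch below assumes the slip only widens the upper bound, since it may not happen at all; the `Estimate` class and its method name are purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """An estimate expressed as a range of weeks rather than a single number."""

    low_weeks: int
    high_weeks: int

    def with_possible_slip(self, max_slip_weeks: int) -> "Estimate":
        """Widen only the upper bound: the slip may or may not materialize."""
        return Estimate(self.low_weeks, self.high_weeks + max_slip_weeks)


if __name__ == "__main__":
    web_app = Estimate(low_weeks=3, high_weeks=5)
    # Assumption: anything touching WCF can cost up to two extra weeks.
    print(web_app.with_possible_slip(max_slip_weeks=2))
```

The point is less the code than the shape of the answer: even in a familiar stack, the honest estimate is a range, and one known risk is enough to widen it noticeably.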

I'm not nearly as skillful in Linux, Python and Node.js, which explains the number of WTF cases I've encountered in the last project.

Not only are there true WTF situations, but also dead ends. I recall working for more than a week on setting up a message queue service which would be used, among other things, for logging. If only I had known that remote syslog is the de facto standard among Linux system administrators! While moving from the message queue service to syslog was a matter of minutes thanks to the decent architecture, the message queue dead end was still a waste of time in the first place.

The moral here is not that I should have been using what already exists instead of reinventing the wheel. I didn't know how much of a standard remote syslog was, nor how simple and reliable (including, surprisingly, with UDP) it is. Following the “don't reinvent the wheel” rule at all costs can lead to a dead end too, by the way: all the time I wasted trying to make MAAS work is a good illustration of that.
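To give an idea of how little is involved in remote syslog over UDP, here is a minimal sketch using `SysLogHandler` from Python's standard library. It is not the code from the project: the `make_remote_logger` helper, the logger name and the `LOCAL0` facility are assumptions made for the example, and a UDP socket on a free local port stands in for the central logging server.

```python
import logging
import logging.handlers
import socket


def make_remote_logger(host: str, port: int, name: str = "app") -> logging.Logger:
    """Return a logger that sends every record to a syslog server over UDP."""
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        facility=logging.handlers.SysLogHandler.LOG_LOCAL0,
        socktype=socket.SOCK_DGRAM,  # UDP: one datagram per record, no connection
    )
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    # Stand-in for the logging server: a UDP socket bound to a free local port.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2.0)
    host, port = server.getsockname()

    log = make_remote_logger(host, port)
    log.info("hello from the application")

    datagram, _ = server.recvfrom(4096)
    print(datagram.decode("utf-8", errors="replace"))
    server.close()
```

In a real setup the address would point at the actual logging server instead of the local stand-in. There is no queue, broker or connection to manage on the sending side, which is a large part of why the switch can take minutes.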

Con­clu­sion

When I started to work on the latest project, I was very clear with myself: this is an experimental project, I don't know how much time it will take, and that's clearly not the point. The fact that I had no skills in the three main aspects of the project also reinforced my feeling that this project was exceptional, made possible only because the person who decides and the person who implements are one and the same.

But in retrospect, it appears that the project is not that exceptional: most projects I've seen are unclear and are carried out in a context where unexpected events happen, often making the original estimate completely irrelevant.

The sole difference, after all, is whether the management is adapted to the nature of the project. While many managers expect impossible deadlines from their teams and fire developers in order to show their bosses that they are taking seriously the problems they created themselves in the first place, I, on the other hand, do what any manager should be doing: be very clear that some things are outside our control because of the nature of the project, and manage it accordingly.