The complexity of API keys

Arseni Mourzenko
Founder and lead developer
June 22, 2015
Tags: security

When it comes to handling API keys, there are classically two situations which are a bit problematic: public sharing of a key and man in the middle. Looking at the approaches taken by Amazon, Google and other big companies, I find them both overly complex and not very flexible, so I want to suggest a very simple approach to a not so simple problem.

Key distribution

As soon as keys start to be used “outside the box,” that is, outside a restricted environment where only a limited number of highly trusted people (system administrators and IT operations staff) can access them, those keys are subject to reuse and misuse. This is the case for every AJAX-based API where the key is simply there, in plain text, viewable by anyone, but it also covers situations where the key is shared between multiple applications which all access the same API.

The problem is essentially one of controlling the entities which have (or had in the past) access to the API key. If there is only one key shared among the consumers, the authority has no control over it: there is no way to tell who is using the key, and the only way to revoke access is to change the key itself, which forces every legitimate consumer to renew the key.

The solution is to have a two-part key. One part is always the same: it is issued by the central authority, or CA (that is, the service in charge of all keys), to the intermediary authority, or IA. The IA now has a key which, as is, cannot be used publicly. On the other hand, the IA can append to it a second part which is generated by the IA itself (possibly with the help of the CA, in order to avoid implementation mistakes related to the cryptographically secure pseudo-random number generator). This second part is unique for every consumer, while the first one remains the same.

The magic behind it is that when the consumer uses a key, the CA receives the complete key: both the first and the second part. The CA validates the first part, which is the only information that matters to the CA. The second part of the key, on the other hand, has two roles:

The benefit of this approach is that the IA, without being responsible for handling the keys and granting access (that is the role of the CA), can still put specific restrictions in place to prevent the key from being misused. The fact that all the keys start with the same first part makes it very easy to identify the source of the keys and to associate them with a specific IA.
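The two-part key described above can be sketched roughly as follows. This is only an illustration under assumptions: the class names, the `.` separator and the 16-byte part length are hypothetical, and the way the second part is looked up by the IA is a guess, since the text only states that the CA validates the first part.

```python
import hmac
import secrets
from typing import Optional, Tuple

SEPARATOR = "."  # hypothetical delimiter; token_urlsafe never emits "."


class CentralAuthority:
    """Issues the first part of a key (one per IA) and validates it."""

    def __init__(self) -> None:
        self._ia_by_prefix = {}  # first part -> IA name

    def issue_prefix(self, ia_name: str) -> str:
        prefix = secrets.token_urlsafe(16)
        self._ia_by_prefix[prefix] = ia_name
        return prefix

    def validate(self, full_key: str) -> Optional[Tuple[str, str]]:
        """Return (IA name, second part) if the first part is known, else None."""
        prefix, sep, suffix = full_key.partition(SEPARATOR)
        if not sep or not suffix:
            return None
        for known, ia_name in self._ia_by_prefix.items():
            # Constant-time comparison of the candidate first part.
            if hmac.compare_digest(known.encode(), prefix.encode()):
                return ia_name, suffix
        return None


class IntermediaryAuthority:
    """Holds one first part and generates a distinct second part per consumer."""

    def __init__(self, prefix: str) -> None:
        self._prefix = prefix
        self._consumer_by_suffix = {}  # second part -> consumer name

    def issue_key(self, consumer: str) -> str:
        suffix = secrets.token_urlsafe(16)
        self._consumer_by_suffix[suffix] = consumer
        return f"{self._prefix}{SEPARATOR}{suffix}"

    def revoke(self, consumer: str) -> None:
        """Drop every second part issued to this consumer; others are untouched."""
        self._consumer_by_suffix = {
            s: c for s, c in self._consumer_by_suffix.items() if c != consumer
        }

    def consumer_for(self, suffix: str) -> Optional[str]:
        return self._consumer_by_suffix.get(suffix)


# Usage: one first part for the IA, one second part per consumer.
ca = CentralAuthority()
ia = IntermediaryAuthority(ca.issue_prefix("example-ia"))
alice_key = ia.issue_key("alice")
bob_key = ia.issue_key("bob")
```

In this sketch, revoking one consumer only removes that consumer's second part, so the first part and every other consumer's key stay valid.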

Man in the middle

Dealing with keys which are not shared with the public, but kept on servers and accessed only by trusted system administrators, is very straightforward when a single service accesses another one, but starts to be pretty hairy in the context of a micro-services infrastructure where keys could be passed to services which can hardly be trusted.

A basic scenario is this one. Imagine a storage service which has its consumers, each with their own key. Now, an image sharing service relies on the storage service to store images. The image sharing service has its own consumers, which should somehow be reflected in the storage service: a consumer who stored an image through the image sharing service should also be able to find this image when using the storage service directly.

This situation makes it impractical for the image sharing service to be an actual customer of the storage service, since that would require creating a sort of super-user account with high privileges, such as the ability to set the owner of a file. This sucks in terms of security, and adds tremendous complexity.

On the other hand, passing the actual storage service API keys from the end customer through the image sharing service down to the storage service is not an option either. What if the image sharing service is compromised? What if a disgruntled employee who manages the image sharing service decides to store the API keys which pass through?

The solution is simple: the service in charge of key validation uses a public-private key pair, meaning that the API keys which pass through the intermediaries are encrypted with the public key and can only be decrypted by that service.

Now is a good opportunity to introduce the concept of a passive intermediary. A service which acts as a passive intermediary relieves itself of verifying the API keys. It doesn't care about the keys: it merely transmits them to the underlying service. This doesn't mean that it should be trusted; it simply means that the consumer of this service doesn't need a dedicated API key for this service, but may need a key for the underlying one. In the example above, the image sharing service would just delegate the checking of the API key to the storage service: everyone who can use the storage service can automatically use the image sharing service as well.

This doesn't mean that the passive intermediary cannot rely on the consumer name. It may trust the response from the underlying service in order to know whether the consumer is a legitimate one or not.
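A passive intermediary along these lines might look like the sketch below. All names are hypothetical. To keep the example runnable with the standard library only, the public-private encryption is replaced by a stand-in that hands out opaque random tokens which only the storage service can resolve; it is not real cryptography, and an actual deployment would use an asymmetric scheme from a vetted library instead.

```python
import secrets
from typing import Callable, Dict, List, Optional, Tuple


def make_stand_in_sealer() -> Tuple[Callable[[str], bytes], Callable[[bytes], Optional[str]]]:
    """Stand-in for a public/private key pair. NOT real cryptography.

    seal() plays the role of encrypting with the public key;
    unseal() plays the role of decrypting with the private key.
    """
    vault: Dict[bytes, str] = {}

    def seal(api_key: str) -> bytes:
        token = secrets.token_bytes(16)
        vault[token] = api_key
        return token

    def unseal(token: bytes) -> Optional[str]:
        return vault.get(token)

    return seal, unseal


class StorageService:
    """Underlying service: the only party able to open sealed keys."""

    def __init__(self, unseal: Callable[[bytes], Optional[str]],
                 consumers: Dict[str, str]) -> None:
        self._unseal = unseal
        self._consumers = consumers  # API key -> consumer name
        self.files: Dict[str, List[str]] = {}

    def store(self, sealed_key: bytes, filename: str) -> Optional[str]:
        """Store a file for the key's owner; return the owner's name, or None."""
        api_key = self._unseal(sealed_key)
        owner = self._consumers.get(api_key) if api_key is not None else None
        if owner is None:
            return None
        self.files.setdefault(owner, []).append(filename)
        return owner


class ImageSharingService:
    """Passive intermediary: keeps no key table and never sees a plaintext key."""

    def __init__(self, storage: StorageService) -> None:
        self._storage = storage
        self.uploads_by_consumer: Dict[str, int] = {}

    def upload(self, sealed_key: bytes, image_name: str) -> bool:
        # Forward the sealed key untouched and trust the answer for the name.
        owner = self._storage.store(sealed_key, image_name)
        if owner is None:
            return False
        self.uploads_by_consumer[owner] = self.uploads_by_consumer.get(owner, 0) + 1
        return True


# Usage: the consumer seals its storage key before talking to the intermediary.
seal, unseal = make_stand_in_sealer()
storage = StorageService(unseal, {"key-123": "alice"})
images = ImageSharingService(storage)
accepted = images.upload(seal("key-123"), "cat.png")
```

The image sharing service learns the consumer name only from the storage service's response, and the file ends up owned by that consumer in the storage service, with no super-user account involved.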

There are two benefits of being a passive intermediary.

Sometimes, the passive intermediary (or any intermediary in general) has to be authenticated by the underlying service as well. Other services may accept anonymous intermediaries, since the encrypted API keys of the consumers are already a good protection against abuse and identity theft. It's up to the service to decide whether the additional protection is needed; the choice doesn't affect the end consumers, since such authentication happens between the two services only.

Overall picture

While it might look as if the model described in Key distribution and the one described in Man in the middle are different, they are just two sides of a single system based on three rules: