Aligning plain text with spaces and tabs

Arseni Mourzenko
Founder and lead developer
April 3, 2019
Tags: prototyping

A fellow developer asked me how to solve the following problem. An application displays some data in the form of a grid. The user should be able to copy this data to a text field in a third-party application. The text field has no support for any presentation, and doesn't even support Unicode. Despite this, the data should “look nice,” i.e. look as if there were actual columns, with text left-aligned in every column.

With­out any pre­sen­ta­tion fea­tures, one has only tabs and spaces to align el­e­ments. To com­pli­cate the mat­ter, the third-par­ty ap­pli­ca­tion uses a pro­pri­etary font which can­not be freely down­loaded. The pro­pri­etary font can­not be changed, and is ob­vi­ous­ly not mono­space.

I sug­gest­ed the fol­low­ing ap­proach. In the sam­ple be­low, “Hel­lo, World” uses more space than “Bye,” “2%” uses less space than “45%,” etc. Ar­rows at the top in­di­cate the tabs.

If I know the width of every character, for instance that the letter “s” is eleven pixels wide, the digit “6” thirteen pixels, and the capital “W” a whole twenty-two pixels, I can compute the width of the content of a given “cell.” From there, I can find the widest cell in a given column, which tells me the tab from which the next column should start. I can then find the position of the middle of the tab just before it: this is essentially the tab which follows the one where the longest text ends.

Every cell of the current column should receive as many spaces at the end of the text as needed to reach the middle of the next tab. Why the middle? Because it reduces the risk of being off by a few pixels: spaces are several pixels wide, so they can't align anything with pixel precision. Instead, the spaces just bring the cursor to the middle of a tab, give or take a few pixels, and then a tab character definitely moves it to the end of the tab.
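Here is a small numeric sketch of why the middle is a safe target. The space width of 6 pixels matches the width table shown later in the article; the tab size of 40 pixels and the cell widths are assumptions chosen only for illustration.

```javascript
// spaceSize matches the width table later in the article;
// tabSize is a hypothetical value.
const spaceSize = 6;
const tabSize = 40;

// Where the cursor lands after padding a cell of the given width
// with whole spaces, aiming for the given pixel position.
const landing = (width, target) =>
    width + Math.round((target - width) / spaceSize) * spaceSize;

// Aim for the middle of the tab that spans 160..200 pixels.
const target = 160 + tabSize / 2;
const positions = [35, 88, 151].map(width => landing(width, target));

// Each landing is within half a space of the target, so it stays inside
// the same tab, and a tab character then jumps to exactly 200 pixels.
console.log(positions);
```

Rounding to whole spaces can miss the target by at most half a space, which is far less than half a tab, so the exact landing position does not matter.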

Back to the ex­am­ple, “Hel­lo, World” ends just be­fore the tab 4, so one should tar­get the mid­dle of the tab 4 with the spaces. Sim­i­lar­ly, “45%” ends at the very be­gin­ning of the tab 6, which leads us to the mid­dle of the tab 7 as our tar­get.

First, the code loops through the rows, measures the size, in pixels, of every cell, and stores this size for later use. It also keeps track of the largest width for every column.

Sec­ond, the code loops one more time through the rows, and this time adds the re­quired spaces. As pre­vi­ous­ly stat­ed, those spaces don't en­sure that the cells of the same col­umn will be all of the same size at a pre­ci­sion of one pix­el: in­stead, they just make sure there are enough spaces in or­der for the string to end some­where around the mid­dle of the tar­get tab. Here, for the first col­umn, the cell of the sec­ond row uses a few more pix­els than the cell of the first row.

When I explained the theory to my colleague, he didn't seem too convinced. He thought that the approach was too complicated and would take a long time to implement. To prove that it wouldn't, I tried to implement the actual algorithm. It took about forty-five minutes.

But before implementing the algorithm, I first had to measure the widths of the characters. This is relatively easy to do by hand in less than ten minutes, but I preferred the hard way: to actually spend nearly two hours writing a script which would do it for me (it's more fun than doing it manually, right?). But first, I decided to make a fake screenshot with some fake data, since I don't have access to the real third-party application or the real data.

Mak­ing the screen­shot

The screenshot wasn't an easy one. First, I needed to select a font, and Google has tons of them, so I wasted at least half an hour looking at all those nice fonts. The next trouble was that browsers use kerning when displaying text. Kerning is when the space between characters depends on the actual characters, to make the text look nicer. For instance, when the letter “V” follows the letter “A,” the space between the letters can be reduced: the top left corner of “V” would appear a few pixels to the left of the bottom right corner of “A.” On the other hand, if it follows “M,” then it should have generous space: we don't want its top left to overlap the top right of “M.”

This is all nice, but also ac­tive­ly harm­ful for what I'm do­ing. In fact, if I mea­sure the width, in pix­els, of every char­ac­ter, and then use it to as­sume the width of the whole string as be­ing the sum of the widths of its char­ac­ters, I'll be wrong most of the time.
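To make the problem concrete, here is a minimal sketch that compares the naive sum of character widths with a kerned width. The widths and the single kerning pair are made-up values for illustration, not measurements of any real font.

```javascript
// Hypothetical per-character widths and one kerning pair, in pixels.
const charWidths = { A: 13, V: 13, M: 19 };
const kerning = { AV: -2 }; // "V" tucks in under "A"

// Width assumed when every character is measured in isolation.
const naiveWidth = text =>
    [...text].reduce((sum, c) => sum + charWidths[c], 0);

// Width once the adjustment for each adjacent pair is applied.
const kernedWidth = text =>
    [...text].reduce((sum, c, i) => {
        const pair = i > 0 ? text[i - 1] + c : '';
        return sum + charWidths[c] + (kerning[pair] || 0);
    }, 0);

console.log(naiveWidth('AVM'), kernedWidth('AVM'));
```

With these values the two results differ by two pixels for “AVM” and agree for “MVA”, so the error depends on the text itself and cannot be corrected by a constant.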

Fortunately, CSS 3 makes it possible to disable kerning, through the font-kerning property. That's nice.

In or­der to mea­sure the widths of char­ac­ters, the eas­i­est way is to in­sert the pipe (ver­ti­cal bar char­ac­ter) be­tween every char­ac­ter, i.e. to do this:

A|B|C|D|E|F|G|H|I|J|K|L|⋯|4|5|6|7|8|9|

Ob­vi­ous­ly, the set of char­ac­ters de­pends on the in­put: I have to in­clude cap­i­tal and small let­ters, dig­its, and all the sym­bols which could po­ten­tial­ly ap­pear in the in­put, in­clud­ing the space. I end­ed up with this page:

<link href="https://fonts.googleapis.com/css?family=Vollkorn" rel="stylesheet">
<div style="font-family: Vollkorn; font-size: 1.25em; font-kerning: none; letter-spacing: 1px;">
<div>A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z|a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z|0|1|2|3|4|5|6|7|8|9|.|,|%|-|+| |</div>
<div>AV.M|<br>VMA.|</div>
</div>

The last line checks that the kern­ing was dis­abled.
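The pipe-separated line in the page above does not have to be typed by hand. A possible helper, assuming the character set is exactly the one used on that page (the names charset and withPipes are mine):

```javascript
// Characters expected in the input; the list mirrors the page above.
const charset =
    'ABCDEFGHIJKLMNOPQRSTUVWXYZ' +
    'abcdefghijklmnopqrstuvwxyz' +
    '0123456789' +
    '.,%-+ ';

// Follow every character with a pipe: "A|B|C|...".
const withPipes = [...charset].map(c => c + '|').join('');

console.log(withPipes);
```

Keeping the character set in one string also gives the measuring script the order in which the characters appear in the screenshot.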

Tak­ing the screen­shot is easy: se­lect the <div> in the De­vel­op­er Tools, press Ctrl+Shift+P, type “screen­shot,” se­lect Cap­ture node screen­shot and en­joy. Here's what I got:

Mea­sur­ing

The next step is to extract the width of every character from the screenshot. While a relatively easy task, it took me much longer than I expected, basically because I was missing a few details at the beginning.

The script starts by read­ing the im­age, pix­el by pix­el. A par­ent loop walks through the im­age from left to right. A child loop walks from top to bot­tom for a giv­en col­umn.

The child loop's task is to determine whether the column contains a pipe character or something else, such as one of the bars of the capital “H.” This, it turned out, is not an obvious task. Anti-aliasing means that one won't have a black line; it won't even be gray, but something... well... complicated. However, by applying simple rules, one can have an algorithm which works pretty well once adjusted for a specific font and size. Here are its steps:

  1. Take only one of the red-green-blue chan­nels and ig­nore all oth­ers. In my case, I took blue, but any oth­er would work as well.

  2. Ig­nore every val­ue high­er than the thresh­old. In my case, I ad­just­ed it sev­er­al times, and end­ed up with a val­ue of 128.

  3. Ig­nore the sides of the im­age. This in­cludes the very edge of the ver­ti­cal lines: be­cause of anti-alias­ing, they have a very dif­fer­ent col­or from the re­main­ing part of the pipe.

  4. Take the first value among the remaining ones in a column, and compare it with all the other remaining values in the column. If at least one of the values is different enough, it's a sign that this is not a pipe.
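The original script was written in Python and is not reproduced here. The four steps could be sketched in JavaScript as follows, assuming the caller has already extracted the blue channel of one pixel column, from top to bottom. Apart from the threshold of 128, every option name and default (edgeTrim, tolerance, minPixels) is an assumption of this sketch, and trimming the ends of the line is only my reading of step 3.

```javascript
// Decide whether one pixel column (blue-channel values, top to bottom)
// looks like part of a pipe. Only the threshold of 128 comes from the
// article; the other options are hypothetical.
const isPipeColumn = function (column, options = {}) {
    const { threshold = 128, edgeTrim = 1, tolerance = 16, minPixels = 4 } = options;

    // Steps 1 and 2: a single channel is given; keep only the dark pixels.
    const dark = column.filter(value => value <= threshold);

    // Step 3: drop the ends of the line, which anti-aliasing blurs.
    const core = dark.slice(edgeTrim, dark.length - edgeTrim);
    if (core.length < minPixels) {
        return false; // Too few dark pixels to be a vertical line.
    }

    // Step 4: every remaining value must be close to the first one.
    return core.every(value => Math.abs(value - core[0]) <= tolerance);
};

// A uniform dark run versus an irregular one (made-up pixel values).
const pipe = [255, 90, 60, 60, 62, 61, 60, 95, 255];
const letter = [255, 255, 100, 40, 120, 30, 110, 20, 255];
console.log(isPipeColumn(pipe), isPipeColumn(letter));
```

The tolerance and the minimum number of pixels are the kind of values that need adjusting for a specific font and size.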

Here's a part of the out­put of the Python script which per­forms those steps:

The ar­rows at the right in­di­cate where the script thinks the pipes are.

The script then out­puts a JSON string which con­tains the num­ber of pix­els for every char­ac­ter. This JSON looks like this and is used dur­ing the pro­cess­ing stage:

const widths = {"A": 13, "B": 15, "C": 15, "D": 17 ⋯ "-": 11, "+": 13, " ": 6};
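How the script turns pipe positions into widths is not shown. One plausible way, sketched below, is to take the distance between the end of one pipe and the start of the next. The function name measureWidths, the pipeAdvance parameter (the horizontal space taken by a pipe), and the sample positions are all assumptions of this sketch.

```javascript
// chars: the characters in the order they appear in the screenshot.
// pipeStarts: x coordinate of the left edge of each detected pipe.
// pipeAdvance: horizontal space taken by one pipe (hypothetical value).
const measureWidths = function (chars, pipeStarts, pipeAdvance) {
    const result = {};
    let start = 0; // The first character begins at the left edge of the image.
    [...chars].forEach((c, i) => {
        result[c] = pipeStarts[i] - start;
        start = pipeStarts[i] + pipeAdvance;
    });
    return result;
};

// Made-up pipe positions that happen to reproduce the first two widths.
const measured = measureWidths('AB', [13, 33], 5);
console.log(JSON.stringify(measured));
```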

Pro­cess­ing

Processing is done in two steps, called prepare and expand. Both are structured very similarly, with three functions: one acts on the rows, another on the cells within a row, and the last one on an individual cell. The first two functions are very simple. For instance, for the prepare step, they are:

const prepareRow = function (input) {
    return input.map(prepareCell);
};

const prepare = function (input) {
    return input.map(prepareRow);
};

The in­ter­est­ing things are hap­pen­ing with­in the third func­tion.

For the first step, the third func­tion mea­sures the width of text, in pix­els, and re­turns an ob­ject con­tain­ing the orig­i­nal text and the width. In or­der to avoid do­ing an ex­tra loop, it also col­lects, in an ar­ray, the in­for­ma­tion about the size used by the widest cell for every col­umn.

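const prepareCell = function (input, column) {
    // Sum the pixel widths of the individual characters.
    const width = input
        .split('')
        .map(c => widths[c])
        .reduce((s, x) => s + x, 0);

    // Track the widest cell of each column; the slot may be empty on first use.
    if (width > (maxWidths[column] || 0)) {
        maxWidths[column] = width;
    }

    return {
        text: input,
        width: width,
    };
};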

The array containing the maximum width in pixels for each column is interesting, but not very useful in its original form. Instead, one would be more interested in knowing which tab should be targeted during the second step, or, more precisely, where the middle of the target tab is. This is done with a very simple transform, where pos is the maximum width of the column:

Math.ceil(pos / tabSize) * tabSize + tabSize / 2

The re­sult is then stored in an ar­ray called columnTargets. Based on this ar­ray, as well as on the struc­ture re­turned by the first step, the sec­ond step adds the re­quired spaces. Noth­ing fan­cy here ei­ther:

const expandCell = function (cell, column) {
    const targeted = columnTargets[column];
    const delta = targeted - cell.width;
    const spaces = Math.round(delta / spaceSize);
    return cell.text + ' '.repeat(spaces);
};
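Putting the two steps together, a self-contained sketch might look like this. The widths for “B” and the space come from the table above; the other widths, the tab size of 40 pixels, the sample rows, and the decision to join the cells with a tab character are assumptions made for illustration.

```javascript
// Partly hypothetical widths for a handful of characters, in pixels.
const widths = {
    H: 17, i: 7, B: 15, y: 12, e: 11,
    '2': 13, '4': 13, '5': 13, '%': 19, ' ': 6,
};
const tabSize = 40; // Assumed tab size in pixels.
const spaceSize = widths[' '];

const rows = [['Hi', '2%'], ['Bye', '45%']];

// Step 1: measure every cell and track the widest cell per column.
const maxWidths = [];
const measure = text => [...text].reduce((sum, c) => sum + widths[c], 0);
const prepared = rows.map(row => row.map((text, column) => {
    const width = measure(text);
    maxWidths[column] = Math.max(maxWidths[column] || 0, width);
    return { text, width };
}));

// Middle of the tab that follows the widest cell of each column.
const columnTargets = maxWidths.map(
    pos => Math.ceil(pos / tabSize) * tabSize + tabSize / 2);

// Step 2: pad with spaces up to the target, then let a tab finish the job.
const lines = prepared.map(row => row
    .map((cell, column) => {
        const spaces = Math.round((columnTargets[column] - cell.width) / spaceSize);
        return cell.text + ' '.repeat(spaces);
    })
    .join('\t'));

console.log(lines.join('\n'));
```

With these numbers, the widest cells are 38 and 45 pixels, so the targets are 60 and 100 pixels: the middle of the second and third tab respectively.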

Result: fewer than one hundred lines of code for a result which looks rather nice, to paraphrase the spec. Check it yourself by looking at the source.