Side effects

Arseni Mourzenko
Founder and lead developer
177
articles
August 21, 2021
Tags: computer-science 1

Re­cent­ly, I had a dis­cus­sion with a col­league about the no­tion of side ef­fects. I was talk­ing about the func­tion­al ap­proach­es to work with ar­rays, and list­ed three func­tions: map, filter, and reduce. As my col­league men­tioned a fourth one, forEach, I told that it shouldn't be put in the same cat­e­go­ry with the pre­vi­ous three func­tions, as forEach im­plies the pres­ence of side ef­fects—while it can be used with an ex­pres­sion free of side ef­fects, it is then per­fect­ly use­less to call it this way: call­ing it or not would make no dif­fer­ence.

My col­league ar­gued that forEach makes per­fect sense in a con­text of pure func­tions, giv­ing the fol­low­ing ex­am­ple:

this.forEach(yesno => yesno.toggle());

Af­ter all, he said, this call doesn't change any glob­al state or any­thing like that—all it does it to change the lo­cal state of the con­text ob­ject it­self. While it looks valid at the first sight, such ar­gu­ment is wrong. This ar­ti­cle pro­vides two ways to for­mal­ly prove that.

1 Proof us­ing side ef­fects prop­a­ga­tion log­ic

1.1 Defin­ing side ef­fect

Wikipedia ar­ti­cle on side ef­fects starts with a de­f­i­n­i­tion of a side ef­fect, and a few ex­am­ples. In­stead of quot­ing the ar­ti­cle, I would rather put a quote of its orig­i­nal source, Spuler, David A.; Sajeev, A. S. M. (Jan­u­ary 1994). Com­pil­er De­tec­tion of Func­tion Call Side Ef­fects. James Cook Uni­ver­si­ty. Cite­SeerX 10.1.1.70.2096:

The term Side ef­fect refers to the mod­i­fi­ca­tion of the non­lo­cal en­vi­ron­ment. Gen­er­al­ly this hap­pens when a func­tion (or a pro­ce­dure) mod­i­fies a glob­al vari­able or ar­gu­ments passed by ref­er­ence pa­ra­me­ters. But here are oth­er ways in which the non­lo­cal en­vi­ron­ment can be mod­i­fied. We con­sid­er the fol­low­ing caus­es of side ef­fects through a func­tion call:

  1. Per­form­ing I/O.
  2. Mod­i­fy­ing glob­al vari­ables.
  3. Mod­i­fy­ing lo­cal per­ma­nent vari­ables (like sta­t­ic vari­ables in C).
  4. Mod­i­fy­ing an ar­gu­ment passed by ref­er­ence.
  5. Mod­i­fy­ing a lo­cal vari­able, ei­ther au­to­mat­ic or sta­t­ic, of a func­tion high­er up in the func­tion call se­quence (usu­al­ly via a point­er).

The rel­e­vant part here is the mod­i­fi­ca­tion of an ar­gu­ment passed by ref­er­ence. So:

function isEmpty(arr) {
    return arr.length === 0;
}

does not have side ef­fects, be­cause it only ac­cess­es the ar­gu­ment, with­out mod­i­fy­ing it. Where­as:

function appendZero(arr) {
    arr.push(0);
}

has a side ef­fect, since it al­ters the orig­i­nal arr.

Our task now is to de­ter­mine whether in the pre­vi­ous forEach ex­am­ple, some­thing is al­ter­ing its ar­gu­ments.

1.2 Prop­a­ga­tion of side ef­fects

Be­fore go­ing back to the orig­i­nal ex­am­ple, it would be use­ful to es­tab­lish how side ef­fects prop­a­gate from func­tion to func­tion. Take the fol­low­ing func­tion:

function greet(name) {
    console.log(`Hello, ${name}!`);
}

Ob­vi­ous­ly, this one has side ef­fects, as it per­forms I/O. Now, let's con­sid­er this:

function greetGeorge() {
    greet('George');
}

In or­der to de­ter­mine whether this func­tion has side ef­fects or not, it is nec­es­sary to find whether the func­tion it calls has side ef­fects. In this case, greet per­forms I/O, there­fore greetGeorge can be con­sid­ered as per­form­ing I/O. In oth­er words:

Prop­er­ty 1:
A func­tion is con­sid­ered to have side ef­fects when at least one of its ex­e­cu­tion paths calls one or more func­tions which have side ef­fects which go be­yond the call­ing func­tion.

Now, let's rewrite greetGeorge like this:

function greetGeorge2(action) {
    action('George');
}

The ac­tion be­ing passed as a pa­ra­me­ter now, there is no pos­si­ble way to de­ter­mine whether greetGeorge has side ef­fects or not with­out con­sid­er­ing the ac­tu­al ar­gu­ments it re­ceives. I would sug­gest there­fore a rule phrased like that:

Prop­er­ty 2:
A func­tion ex­e­cut­ing the ac­tions which it re­ceives as ar­gu­ments is con­sid­ered to have side ef­fects with­in a giv­en ex­e­cu­tion con­text where one or more ac­tions have side ef­fects.

In oth­er words, I wouldn't con­sid­er greetGeorge2 to in­her­ent­ly have side ef­fects; in­stead, its prop­er­ty of pro­duc­ing side ef­fects de­pends en­tire­ly on its ac­tu­al ex­e­cu­tion.

1.3 De­com­pos­ing the code

In the orig­i­nal forEach ex­am­ple, there is one ex­pres­sion, one func­tion, and one pro­to­type:

  1. The en­tire ex­pres­sion this.forEach(yesno => yesno.toggle()).
  2. The func­tion yesno => yesno.toggle().
  3. The pro­to­type this.forEach(action).

To make things clear­er, let's rewrite the func­tion yesno => yesno.toggle() like this:

function toggleIt(yesno) {
    yesno.toggle();
}

Is it free of side ef­fects? Based on Prop­er­ty 1 above, this would de­pend on what toggle does. I would guess that toggle it­self is im­ple­ment­ed sim­i­lar­ly to this:

YesNo.prototype.toggle = function () {
    this.underlyingValue != this.underlyingValue;
}

Here, toggle is a pro­to­type. A pro­to­type is not a func­tion: in­stead, here, it would be sim­i­lar to a method in lan­guages such as Java, C#, or Python, i.e. it doesn't live in a vac­u­um, but has a con­tex­tu­al ob­ject at­tached to it, re­ferred with­in the pro­to­type by the key­word this. Func­tions, by their na­ture, don't have con­tex­tu­al ob­jects: they have a knowl­edge of the world around them through the ar­gu­ments passed to them by the caller. Python's way to write meth­ods is quite ex­plic­it: un­like many oth­er lan­guages which hide this, Python leaves it to the pro­gram­mer to write it ex­plic­it­ly as a first pa­ra­me­ter:

class StorageApi:
    ...
    def _get_uri(self, name):
        encoded_name = urllib.parse.quote_plus(name)
        return f"{self.storage_api}values/{self._domain}/{encoded_name}"

Back to the toggle pro­to­type, it would look like this in a form of a func­tion:

const toggle = function (self) {
    self.underlyingValue != self.underlyingValue;
}

Now, it is clear that it has side ef­fects: it takes a mu­ta­ble ar­gu­ment passed by ref­er­ence, and mod­i­fies it. There­fore toggleIt has side ef­fects, and so does its orig­i­nal vari­ant yesno => yesno.toggle().

What's with the pro­to­type this.forEach(action)? Since in the cur­rent case, action al­ways has side ef­fects, and since forEach al­ways ex­e­cutes the action, one can con­clude, based on the pre­vi­ous­ly stat­ed Prop­er­ty 2, that in this spe­cif­ic script, forEach (al­ways) has side ef­fects. Since this.forEach(yesno => yesno.toggle()) calls forEach, it too has side ef­fects, giv­en Prop­er­ty 1.

2 Proof us­ing ref­er­en­tial trans­paren­cy

An easy way to find whether a func­tion has side ef­fects or not is to make every­thing con­stant and im­mutable, and then imag­ine a giv­en func­tion be­ing ex­e­cut­ed.

Let's take an ex­am­ple of a code with no side ef­fects:

function product(arr) {
    return arr.reduce((acc, elem) => acc * elem, 1);
}

In a per­fect­ly im­mutable con­text, this func­tion would be­have the ex­act same way it does oth­er­wise: it will take an im­mutable ar­ray, and re­turn the prod­uct of all the num­bers it finds in­side it.

Now what about the orig­i­nal:

this.forEach(yesno => yesno.toggle());

As soon as you call it, it will fail. As yesno is im­mutable, try­ing to change its underlyingValue will throw a TypeError ex­cep­tion (at least in strict mode).

In short, if a func­tion can work only in a con­text where at least one of the sur­round­ing ob­jects is mu­ta­ble, then the func­tion has side ef­fects.

Haskell, for in­stance, makes a clear sep­a­ra­tion be­tween pure func­tions, and every­thing else—this every­thing else tak­ing a form of monadic ac­tions. Pure func­tions not only lack side ef­fects, but also al­ways re­turn the same val­ue giv­en the same ar­gu­ments. The pu­ri­ty helps rea­son­ing about the code, and it cer­tain­ly helps the com­pil­er when op­ti­miz­ing it. For in­stance, a re­sult of a pure func­tion can be cached, with no need for cache in­val­i­da­tion.

The prop­er­ty where the ac­tu­al call to a func­tion can be re­placed by a val­ue that the func­tion re­turned be­fore is called ref­er­en­tial trans­paren­cy.

Let's imag­ine for a mo­ment that this.forEach(yesno => yesno.toggle()) is pure. It would mean that the whole ex­pres­sion can be re­placed by undefined—this is the val­ue re­turned by forEach. It is ob­vi­ous, how­ev­er, that do­ing so would break the pro­gram, as the un­der­ly­ing val­ue of yesno of the el­e­ments of the ar­ray would not change any longer. There­fore, the ex­pres­sion does not have ref­er­en­tial trans­paren­cy, and so is not pure.

A pure func­tion has two prop­er­ties:

Since the ex­pres­sion this.forEach(yesno => yesno.toggle()) was proven to be not pure, it means that it lacks at least one of the two prop­er­ties. Could it be the sec­ond one? Not re­al­ly: forEach would al­ways re­turn undefined, which im­plies that it al­ways re­turns the same val­ue giv­en the same ar­gu­ments. So it is nec­es­sar­i­ly the first prop­er­ty that it lacks: in oth­er words, it has side ef­fects.

3 Call it twice

One read­er of this blog told me that there is an easy way to show that the dis­cussed ex­pres­sion has side ef­fects.

One re­mark­able prop­er­ty of func­tions which have no side ef­fects is that they can be called sev­er­al times, with­out any con­se­quence. The re­turn val­ue may po­ten­tial­ly dif­fer (un­less the func­tion is also pure), but there should be no oth­er con­se­quences.

Here, it doesn't take long to see that call­ing this.forEach(yesno => yesno.toggle()) twice would screw things up. The first call will swap the boolean val­ues; the sec­ond one will can­cel the ef­fect of the first. This means that by no means the ex­pres­sion can be con­sid­ered to have no side ef­fects.

In­deed, while not be­ing a for­mal proof, the “call it twice and see what hap­pens” ap­proach is large­ly enough to high­light the side ef­fects in the cur­rent sit­u­a­tion.