## Is purity inefficient?

October 17th, 2008 by Nils Anders Danielsson

Last week we discussed whether pure languages are inherently less efficient than impure ones, and whether that is an interesting question to ask.

In *Pure versus impure Lisp*, Pippenger showed that, under certain assumptions, pure Lisp can be strictly less efficient than impure Lisp. However, as Pippenger himself noted, his result assumes a particular operational semantics for the two languages. Bird, Jones and de Moor later showed that the result does not apply to a *lazy* pure language (one using call-by-need).

So, what if we change the operational semantics? By using a smart (impure) implementation of functional (pure) references, due to Baker, one can ensure that, as long as the references are used single-threadedly, all accesses and updates take constant time. (See Section 2.3.1 of *A Persistent Union-Find Data Structure* for an explanation of Baker’s idea.) Furthermore, the heap of a sequential, impure program (with a standard semantics) is always used single-threadedly. Hence, if we incorporate Baker’s idea into the operational semantics of a pure language, then impure sequential programs can always be translated into equivalent pure ones, which execute with only a constant-factor overhead in time.
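Baker’s idea can be sketched in a few lines of (impure) Python; this is my own illustrative rendition, not code from the paper, and all names in it are made up. A persistent array is represented either as a real array or as a diff against another version, and every access first “reroots” so that the version being read holds the real array. If the structure is used single-threadedly, every operation touches an already-rerooted version and costs O(1):

```python
class PArray:
    """Persistent array a la Baker: exactly one version holds the real
    list; every other version is a (index, old_value, next) diff node
    pointing towards it."""

    def __init__(self, data):
        self.contents = ('arr', list(data))

    @staticmethod
    def _reroot(pa):
        # Make `pa` hold the actual array, reversing the diff links
        # along the way so the previously-current version becomes a diff.
        if pa.contents[0] == 'arr':
            return
        _, i, v, base = pa.contents
        PArray._reroot(base)
        _, arr = base.contents
        old = arr[i]
        arr[i] = v
        base.contents = ('diff', i, old, pa)
        pa.contents = ('arr', arr)

    def get(self, i):
        PArray._reroot(self)
        return self.contents[1][i]

    def set(self, i, v):
        # Update in place, but leave behind a diff so the old version
        # can still be reconstructed on demand.
        PArray._reroot(self)
        arr = self.contents[1]
        new = PArray.__new__(PArray)
        old = arr[i]
        arr[i] = v
        new.contents = ('arr', arr)
        self.contents = ('diff', i, old, new)
        return new
```

For example, `a1 = a0.set(1, 7)` mutates the underlying list, yet `a0.get(1)` still returns the old value, because reading `a0` reroots the diff chain back. Reading an old version is what costs extra; single-threaded code never does it.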

However, this translation does not really buy us much; for instance, understanding the translated program is no easier than understanding the original one. The question I am really interested in is to what degree programs written in a pure (functional) *style* are less efficient than those written in an impure style. Unfortunately this question seems hard to formulate well, because “pure style” and “impure style” are rather subjective notions. Furthermore, one still has the problem of choosing suitable operational semantics. Finally, as pointed out by Thorsten Altenkirch, a programming language can support semantics-preserving program annotations which the programmer can use to guide the compiler; in effect this allows the operational semantics to be tweaked on a case-by-case basis.

October 17th, 2008 at 4:24 pm

Hi Nils,

Thanks for an interesting post! I find this a very interesting question, for the following reason. If you consider the linear lambda calculus, then there’s no question about implementing it efficiently, because the linear-use criterion means that you can always safely do destructive updates. It’s only at exponential types !A that destructive updates become unsafe. In other words, update becomes unsafe once we permit sharing or reuse of values.
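The point about linearity can be seen in a toy Python sketch (mine, not the commenter’s; both function names are hypothetical). When a value is used linearly, i.e. the old version is never consulted again, the copying and the destructive implementations of update are observationally indistinguishable; sharing is exactly what breaks the equivalence:

```python
def set_copy(xs, i, v):
    """Pure update: allocate a new list, so old versions stay valid."""
    ys = list(xs)
    ys[i] = v
    return ys

def set_inplace(xs, i, v):
    """Destructive update: O(1), but only safe if `xs` is not shared."""
    xs[i] = v
    return xs

# Linear use: the binding `a` is consumed by the update and never
# reused, so the two implementations give the same observable result.
a = [1, 2, 3]
a = set_inplace(a, 0, 9)      # same outcome as a = set_copy(a, 0, 9)

# Non-linear (shared) use: the difference becomes observable.
b = [1, 2, 3]
c = set_copy(b, 0, 9)
assert b[0] == 1              # old version intact
d = [1, 2, 3]
e = set_inplace(d, 0, 9)
assert d[0] == 9              # old version destroyed
```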

That’s well known, almost a commonplace, but now consider the flip side: efficient algorithms go out of their way to avoid redoing work, and one of the most important mechanisms for doing so is sharing! For example, compiler optimizations often work with truly hairy graph representations to turn naively cubic algorithms into n-log-n ones. And of course you pointed out union-find, which is the queen of algorithms that exploit sharing.

So in a vague, handwavy way it seems like you get a choice of two ways to use sharing, but you can’t (easily?) use both at once. It would be incredibly cool to learn of some nice logical explanations of this tradeoff. Perhaps the work on light affine logic has something to say here?

December 2nd, 2009 at 7:25 pm

The claim that the overhead incurred by employing Find-Union is constant seems wrong (otherwise please do correct me). The best known structure falls slightly short of that, performing Union in constant time and Find in amortized time bounded by the inverse of the Ackermann function.
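For reference, here is a minimal Python sketch (not from this discussion) of the standard imperative structure the comment alludes to: union by rank makes Union on two roots constant-time, while path compression makes Find run in amortized time bounded by the inverse Ackermann function:

```python
class UnionFind:
    """Imperative union-find with union by rank and path compression.
    Any sequence of m operations on n elements takes O(m * alpha(n))
    time, where alpha is the inverse Ackermann function."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Locate the root, then compress: point every node on the
        # visited path directly at the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
```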

January 27th, 2010 at 4:46 pm

> The claim that the overhead incurred by employing Find-Union is constant seems wrong (otherwise please do correct me).

I have not made such a claim.