## The new GHC API

December 8th, 2008 by Wouter

abstracting the pain away

Last Friday we had Thomas Schilling visiting. He talked about his internship at MSR. He has been cleaning up the GHC API. I’ve uploaded an annotated copy of his slides.

I talked about my recent attempts to understand call-by-push-value, a typed lambda calculus that is very precise about when evaluation occurs.

The idea underlying CBPV is to **distinguish the type of values and computations**. You can think of a value as a fully evaluated Int, String, pair of values, etc. Functions, on the other hand, are computations abstracting over values. I’ve put together an overview of all the types, terms, typing judgements, and evaluation rules (forgive me, but I’d much rather edit large formulas in LaTeX directly than in WordPress).

My motivation for this is very different from Paul Levy’s: I’m dreaming of a language with enough laziness to be interesting, but with more predictable memory usage than Haskell. The next step is to figure out how to add (value) data and (computational) codata, effects (no IO monad), and enough syntactic sugar that you don’t need to be explicit about every thunk/force. There’s a relationship with Conor’s ideas about Frank and the recent work on focussing and duality at CMU that I haven’t managed to nail down.

The theory of species was originally conceived by André Joyal and is used to analyze combinatorial structures in terms of their *generating functions*. In my talk on Friday I showed how the theory of species can be developed from a functional programming perspective, with a view towards using them to analyze datatypes. I am diverging a bit from the classical definitions in defining a generalisation of species which matches a specialisation of the notion of a container.

A *generalized species* is given by a family [math]B : \mathbb{N} \to \mathrm{Set}[/math]; the idea is that [math]B\,n[/math] is the set of structures which have [math]n[/math] positions. E.g. consider the species of leaf-labelled binary trees: [math]B\,3[/math] is the set of the two binary trees which have 3 leaves (*insert picture here*). Given a species [math]B[/math] we assign a functor by [math]\llbracket B \rrbracket\,X = \Sigma_{n \in \mathbb{N}}\, B\,n \times X^n[/math]. The definition corresponds to the generating function [math]\sum_n |B\,n|\,x^n[/math], but this is a function on the reals, not on sets (see my recent posting on sets and numbers). Note that [math]X^n[/math] is just [math]n \to X[/math], where we identify a natural number with the set of numbers smaller than it (usually called Fin in dependently typed programming).
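The family for the tree example can be sketched in Haskell, listing for each size the set of tree shapes with that many leaf positions (a small illustration of my own, not from the original post):

```haskell
-- the species of leaf-labelled binary trees:
-- trees n enumerates the shapes with n leaf positions
data Tree = Leaf | Node Tree Tree deriving (Eq, Show)

trees :: Int -> [Tree]
trees 1 = [Leaf]
trees n = [ Node l r | k <- [1 .. n - 1], l <- trees k, r <- trees (n - k) ]
```

Here `trees 3` yields exactly the two trees mentioned above, and the sizes 1, 1, 2, 5, 14, … are the Catalan numbers, the coefficients of the species’ generating function.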

Category question: If species are the objects, what are the morphisms? Let’s define a category [math]\mathrm{Spec}[/math]. Given species [math]A[/math] and [math]B[/math], a morphism [math]A \to B[/math] is given by a triple [math](f,g,h)[/math]:

- [math]f : \mathbb{N} \to \mathbb{N}[/math], calculating the numbers of positions;

- [math]g : \forall n.\, A\,n \to B\,(f\,n)[/math], calculating the new shape;

- [math]h : \forall n.\, A\,n \to f\,n \to n[/math], assigning source positions to target positions.

I leave it as an exercise to define identity and composition of those morphisms.

The definition of morphisms hopefully becomes clear when we give their interpretation as natural transformations, i.e. to every species morphism [math](f,g,h) : A \to B[/math] we assign a natural transformation [math]\llbracket(f,g,h)\rrbracket : \llbracket A\rrbracket \to \llbracket B\rrbracket[/math]:

[math]\llbracket(f,g,h)\rrbracket\,(n, a, v) = (f\,n,\; g\,a,\; v \circ h\,a)[/math], where [math]v : n \to X[/math].

**Operations on species**: Given species [math]A, B[/math]:

- Coproducts are easy: [math](A + B)\,n = A\,n + B\,n[/math].

- Products are a bit more involved: [math](A \times B)\,n = \Sigma_{k+l=n}\, A\,k \times B\,l[/math].

- Composition is also interesting: [math](A \circ B)\,n = \Sigma_{k}\, A\,k \times \Sigma_{n_1+\cdots+n_k=n}\, \Pi_i\, B\,n_i[/math].

Given a species [math]F[/math] we can construct the initial algebra [math]\mu\,\llbracket F\rrbracket[/math] of the associated functor.
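The coproduct and product can be mirrored on the counting sequences [math]|A\,n|[/math], i.e. the coefficients of the ordinary generating functions. A small Haskell sketch (my addition, representing a counting sequence as an infinite stream):

```haskell
-- a_n = number of shapes with n positions, as an infinite stream
type Counts = [Integer]

-- coproduct of species: coefficients add
coprod :: Counts -> Counts -> Counts
coprod = zipWith (+)

-- product of species: c_n = sum over k + l = n of a_k * b_l
prod :: Counts -> Counts -> Counts
prod a b = [ sum [ (a !! k) * (b !! (n - k)) | k <- [0 .. n] ] | n <- [0 ..] ]
```

For example, with `repeat 1` (one shape of every size) for both arguments, `prod` gives the sequence 1, 2, 3, 4, …, just as multiplying the generating function 1/(1-x) by itself does.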

*to be continued*

To simplify: Mathematics has to do with numbers, while Computer Science is more concerned with types (let’s call them sets). On the level of the natural numbers there is a nice equivalence identifying a finite set with the number of its elements. The operations of addition and multiplication on numbers correspond to coproduct (or disjoint union) and (cartesian) product of sets. Indeed, we even use the same symbols [math]+[/math] and [math]\times[/math]. Exponentiation corresponds to the function space [math]A \to B[/math]; indeed, category theoreticians use the exponential notation [math]B^A[/math] for the latter.
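The correspondence can be checked mechanically for small finite types; a quick Haskell illustration (my addition):

```haskell
import Control.Monad (replicateM)

bools :: [Bool]
bools = [False, True]

-- |A + B| = |A| + |B| : disjoint union (Either)
sumSize :: Int
sumSize = length ([Left x | x <- bools] ++ [Right y | y <- bools])

-- |A * B| = |A| * |B| : cartesian product
prodSize :: Int
prodSize = length [ (x, y) | x <- bools, y <- bools ]

-- |B| ^ |A| : functions A -> B, enumerated by their graphs
expSize :: Int
expSize = length (replicateM (length bools) bools)
```

All three come out as 4 = 2 + 2 = 2 × 2 = 2².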

But then the ways part, and the schism begins. For the Mathematicians the natural numbers are only half of the story (indeed a **semi**ring) and they hasten to add the negative numbers. If we want to keep exponentiation, we can’t stop at the integers: we get the rational numbers [math]\mathbb{Q}[/math], algebraic reals and even algebraic complex numbers. The intuitive completion of these constructions is the complex numbers [math]\mathbb{C}[/math].

Negative sets don’t really make sense; we are more interested in infinite sets. E.g. we would like to have a fixpoint of successor: [math]\mathbb{N} = \mu X.\, 1 + X[/math]. Combining this with exponentiation leads to larger sets like [math]\mathbb{N} \to \mathbb{N}[/math] or ordinal notations. This time the intuitive completion is the universe of sets.

But the story may have a happy ending. Once we start to analyze operations on those spaces, a new analogy seems to emerge. E.g. differentiation is well known as an operation on complex functions, but as Conor McBride observed it also makes sense as an operation on functors, i.e. operations on sets. Not on all functors, but then differentiation doesn’t work on all functions either. The analogy carries further, and indeed we can use a variation of Taylor’s theorem to analyze datatypes. Hence here we seem to have a new union: the glimpse of a generalized calculus applicable to operations on numbers and sets.
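For instance, the derivative of the list functor is a pair of lists: a list with a one-element hole in it. A tiny sketch (my example, not from the post):

```haskell
-- derivative of the list functor: a one-hole context,
-- represented by the elements before and after the hole
data ListD a = ListD [a] [a] deriving (Eq, Show)

-- filling the hole recovers a list
plug :: a -> ListD a -> [a]
plug x (ListD before after) = before ++ [x] ++ after
```

This is exactly the "d/dx of 1/(1-x) is 1/(1-x)²" identity read back at the level of datatypes.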

One of the areas where this analogy has been exploited is the theory of species, where ideas from calculus are exploited to prove theorems about the combinatorics of structures. To me much of the literature seems to be written from the perspective of Mathematicians who know numbers and try to use them to analyze sets. I’d prefer to do it the other way around, start with sets and see whether we can steal ideas from conventional calculus.

This is basically the introduction to a more technical talk about *generalized species* which I will post separately.

Last week we discussed whether pure languages are inherently less efficient than impure ones, and if that is an interesting question to ask.

In Pure versus impure Lisp Pippenger showed that, under certain assumptions, pure Lisp can be strictly less efficient than impure Lisp. However, as noted by Pippenger, his result assumes certain operational semantics for the two languages. The result was later shown to be inapplicable to a *lazy* pure language (using call-by-need) by Bird, Jones and de Moor.

So, what if we change the operational semantics? By using a smart (impure) implementation of functional (pure) references, due to Baker, one can ensure that, as long as the references are used single-threadedly, all accesses and updates take constant time. (See Section 2.3.1 of A Persistent Union-Find Data Structure for an explanation of Baker’s idea.) Furthermore the heap of a sequential, impure program (with a standard semantics) is always used single-threadedly. Hence, if we incorporate Baker’s idea into the operational semantics of a pure language, then impure sequential programs can always be translated into equivalent pure ones, which execute with only a constant overhead in time.
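The flavour of Baker’s trick can be sketched in Haskell (my reconstruction of the idea; a plain list stands in for the underlying mutable array, so only the rerooting mechanics, not the complexity claim, are illustrated):

```haskell
import Data.IORef

-- a persistent array: the current version holds the real data,
-- older versions hold a diff pointing towards it
data Node a = Arr [a] | Diff Int a (PArr a)
newtype PArr a = PArr (IORef (Node a))

new :: [a] -> IO (PArr a)
new xs = PArr <$> newIORef (Arr xs)

-- reroot: make this version hold the array, inverting diffs on the way
reroot :: PArr a -> IO [a]
reroot this@(PArr ref) = do
  node <- readIORef ref
  case node of
    Arr xs -> return xs
    Diff i v parent@(PArr pref) -> do
      xs <- reroot parent
      let old = xs !! i
          xs' = take i xs ++ [v] ++ drop (i + 1) xs
      writeIORef pref (Diff i old this)
      writeIORef ref (Arr xs')
      return xs'

get :: PArr a -> Int -> IO a
get a i = (!! i) <$> reroot a

-- set produces a new version; the old one becomes a diff
set :: PArr a -> Int -> a -> IO (PArr a)
set this@(PArr ref) i v = do
  xs <- reroot this
  let old = xs !! i
      xs' = take i xs ++ [v] ++ drop (i + 1) xs
  next <- PArr <$> newIORef (Arr xs')
  writeIORef ref (Diff i old next)
  return next

-- single-threaded use: every access hits the Arr node directly
demo :: IO Bool
demo = do
  a0 <- new [0, 0, 0 :: Int]
  a1 <- set a0 1 5
  x  <- get a1 1   -- new version sees the update
  y  <- get a0 1   -- old version still readable (rerooting)
  z  <- get a1 1   -- rerooting back restores the update
  return (x == 5 && y == 0 && z == 5)
```

When the structure is used single-threadedly, every `get` and `set` finds an `Arr` node immediately; in Baker’s real construction the list is a mutable array, making those operations constant-time.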

However, this translation does not really give us much; understanding the translated program is no easier than understanding the original one, for instance. The question I am really interested in is to what degree programs written in a pure (functional) *style* are less efficient than those which use an impure style. Unfortunately this question seems hard to formulate in a good way, because “pure style” and “impure style” are rather subjective notions. Furthermore one still has the problem of choosing suitable operational semantics. Finally, as pointed out by Thorsten Altenkirch, a programming language can support semantics-preserving program annotations which the programmer can use to guide the compiler; this in effect enables tweaking the operational semantics on a case-by-case basis.

I presented a simple derivation of Paul’s construction of the exponential of containers. I used the opportunity to discuss the exponential of functors and the Yoneda lemma. My derivation is based on the observation that, writing [math]F^G[/math] for the exponential of the functors [math]F[/math] and [math]G[/math], exponentiation by a representable functor [math]Y^A\,X = A \to X[/math] (a *Napierian functor*) is easy: [math]F^{Y^A}\,X \cong F(A + X)[/math].

This can be shown using the Yoneda lemma. Then given a container [math]S \lhd P[/math] and a functor [math]F[/math] we can reason as follows:

[math]F^{\llbracket S \lhd P\rrbracket} \;\cong\; F^{\Sigma_{s : S}\, Y^{P\,s}} \;\cong\; \Pi_{s : S}\, F^{Y^{P\,s}} \;\cong\; \Pi_{s : S}\, F \circ (P\,s + -)[/math]

This shows that one can exponentiate any functor with a container (predicatively) and that the exponential of containers is a container, since we know that [math](P\,s + -)[/math] is a container and containers are closed under composition and products. Expanding the definition gives rise to Paul’s definition.

Monads are an essential tool for functional programmers. When it comes to *combining* monads, the most popular technique is to use monad transformers to add an effect layer to an existing monad. Do you want a new state for your monad? Use the state monad transformer on your monad and voilà! It’s done!

Actually, you are not really done. Monads usually come with a set of operations to manipulate the effect they model. For example, the state monad comes with operations for reading and updating the state (`get` and `put`), and the environment monad (`Reader` in Haskell) comes with operations for reading the environment and performing a computation in a modified environment (`ask` and `local`). If you want the operations of your monad to be supported by the transformed monad, then you need to show how to lift the operations to the transformed monad. Unless a technology is used for automating these liftings, you need to provide an instance of every operation for every monad transformer in the library, leading to a quadratic number of operation instances. That’s not modular!
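A concrete, hand-rolled instance of the problem (my illustration): lifting `local` through a state transformer has to be written out by hand, and similarly for every other operation/transformer pair:

```haskell
newtype Reader r a = Reader { runReader :: r -> a }
newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }

-- run a reader computation in a modified environment
local :: (r -> r) -> Reader r a -> Reader r a
local f (Reader g) = Reader (g . f)

-- one such definition is needed per (operation, transformer) pair
localStateT :: (r -> r) -> StateT s (Reader r) a -> StateT s (Reader r) a
localStateT f m = StateT (local f . runStateT m)

-- reading the environment under the state transformer
askStateT :: StateT s (Reader r) r
askStateT = StateT (\s -> Reader (\r -> (r, s)))
```

With n operations and m transformers this bookkeeping grows as n × m, which is exactly the modularity problem described above.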

The problem is not in monad transformers (other technologies for combining monads such as distributive laws or coproduct of monads suffer from the same problem). The problem is that **many of the operations associated with a monad are not really compatible with the monad**.

Monads have axioms that ensure that `return` and `bind` are well-behaved, so it is not unreasonable to think that there should be some axioms associated to its operations. So I propose to require operations to be algebraic. In our context this means that an operation `h` for a monad `M` should be of type `forall a . S (M a) -> M a` for some functor `S`, such that:

`join . h = h . fmap join`

where `join` is the multiplication of the monad: `join = (>>= id)`.

For example, `uncurry (++) :: ([a],[a]) -> [a]` is algebraic for the list monad, with `S a = (a,a)`. On the other hand, `uncurry local :: (r->r, Reader r a) -> Reader r a` has the correct type, with functor `S a = (r -> r, a)`, but is not algebraic, as the condition does not hold. The operation `ask :: Reader r r` doesn’t seem to have the correct type, but it is isomorphic to a function `ask' :: (r -> Reader r a) -> Reader r a`, which is algebraic for `Reader r` with `S a = r -> a`. It’s convenient to consider algebraic operations up to isomorphism. Then, `ask` and `(++)` are algebraic.
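These two checks can be spelled out on concrete inputs (my examples, modelling readers as plain functions `r -> a`): for the list monad the two sides of the law agree, while for `local` they differ:

```haskell
-- h = uncurry (++), S a = (a, a), M = []
lhsList, rhsList :: [Int]
lhsList = concat (uncurry (++) ([[1], [2]], [[3]]))       -- join . h
rhsList = uncurry (++) (concat [[1], [2]], concat [[3]])  -- h . fmap join

-- the reader monad as plain functions; join duplicates the environment
joinR :: (r -> r -> a) -> (r -> a)
joinR mm r = mm r r

-- localOp (f, m): run m in a modified environment; S a = (r -> r, a)
localOp :: (r -> r, r -> a) -> (r -> a)
localOp (f, m) = m . f

-- a nested computation that can tell the two sides apart
mm :: Int -> Int -> Int
mm r r' = r + 2 * r'

lhsReader, rhsReader :: Int
lhsReader = joinR (localOp ((+ 1), mm)) 0   -- join . h
rhsReader = localOp ((+ 1), joinR mm) 0     -- h . fmap join
```

For the list monad both sides concatenate everything; for the reader, the left side feeds the *original* environment to the inner computation while the right side feeds the *modified* one, so the law fails.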

So, what’s good about algebraic operations? They can be lifted by **any** monad morphism! Every algebraic operation is isomorphic to a natural transformation in the Kleisli category of the monad.

This means that for every algebraic operation `h :: forall a. S (M a) -> M a`, monad `N`, and monad morphism `lift :: forall a. M a -> N a`, we get an algebraic operation `fromK (lift . toK h) :: forall a. S (N a) -> N a`, i.e. the lifting of `h` to `N`.
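Concretely (my sketch, eliding the `toK`/`fromK` plumbing): an algebraic `h` is determined by `h . fmap return :: S a -> M a`, and lifting along a monad morphism is "apply that, push through the morphism, then `join`":

```haskell
import Control.Monad (join)

-- for h = uncurry (++) and S a = (a, a):
-- sigma = h . fmap return, i.e. sigma (x, y) = [x] ++ [y]
sigma :: (a, a) -> [a]
sigma (x, y) = [x, y]

-- lift append along mm, the relevant instance of a monad
-- morphism (forall a. [a] -> n a) out of the list monad
liftAppend :: Monad n => ([n b] -> n (n b)) -> (n b, n b) -> n b
liftAppend mm p = join (mm (sigma p))
```

With the identity morphism we get ordinary append back; `reverse` also happens to be a monad morphism from `[]` to itself, and the lifted operation is then flipped append.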

Append is algebraic for the list monad, so we get a “liftable” version of append: for every monad morphism out of the list monad, we get a lifted operation.

Of course, not all the usual operations are algebraic for their monad. This does not mean that the operation is incorrect, but that the monad should be more powerful. **The operations determine the appropriate monad**.

The operation `local` is not algebraic for `Reader`, but it can become algebraic if we implement the environment monad with `State`:
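A sketch of the idea (my reconstruction): implement the environment monad state-style, so that running the rest of the computation in a modified environment is just threading the modified state on.

```haskell
-- the environment monad implemented as state: Reader' r a = r -> (a, r)
newtype Reader' r a = Reader' { runReader' :: r -> (a, r) }

-- the multiplication of the monad
joinR' :: Reader' r (Reader' r a) -> Reader' r a
joinR' (Reader' g) = Reader' (\r -> let (m, r') = g r in runReader' m r')

-- local', with S a = (r -> r, a): simply continue in the
-- modified environment; no explicit restore is needed here
local' :: (r -> r, Reader' r a) -> Reader' r a
local' (f, Reader' g) = Reader' (g . f)

ask' :: Reader' r r
ask' = Reader' (\r -> (r, r))

-- a sample nested computation for checking the algebraicity law
mmSample :: Reader' Int (Reader' Int Int)
mmSample = Reader' (\r -> (Reader' (\r' -> (r + 2 * r', r')), r))
```

One can check the law directly: `joinR' (local' (f, mm))` and `local' (f, joinR' mm)` now agree, because the modified environment is carried along by the state instead of being forgotten.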

Since `local'` is algebraic for `Reader'`, we get a liftable operation `localL`. The apparently arbitrary move from `Reader` into `State` seems like a big eureka step. Actually, it’s the result of applying a general technique for finding appropriate monads for a given set of operations (paper in progress).

Using these ideas, I managed to re-implement the monad transformer library with automatic lifting of operations. I will release the library as soon as I write a minimum of documentation.

Last Friday, I talked about what I did during the second part of my internship. I explained an Agda implementation, due to Thorsten Altenkirch, of hereditary substitutions for the simply-typed λ-calculus, based on ideas from Conor McBride and Andreas Abel.

Hereditary substitutions are substitutions over typed λ-calculus that are structurally recursive. They provide an algorithm for normalization whose termination can be proved by a simple lexicographic induction. The main idea of the algorithm is to substitute and re-normalize simultaneously.

We will restrict ourselves to the simply-typed λ-calculus and use de Bruijn notation. We will also use a notation from Conor: if Γ is a context and x a variable in Γ, then Γ-x is the context Γ without the variable x.

Normal forms in the simply-typed λ-calculus are terms that are β-reduced and η-expanded as much as possible. There are two ways of constructing them:

- λn : Nf (Γ,σ) τ -> Nf Γ (σ → τ)
- ne : Var Γ σ -> Sp Γ σ ο -> Nf Γ ο

where spines are lists of normal forms:

- ε : Sp Γ σ σ
- _,_ : Nf Γ σ -> Sp Γ τ ρ -> Sp Γ (σ → τ) ρ

We want to define hereditary substitutions over normal forms:

_ [ _ := _ ] : Nf Γ σ -> (x : Var Γ τ) -> Nf (Γ-x) τ -> Nf (Γ-x) σ

in parallel with substitutions over spines:

_ { _ := _ } : Sp Γ σ τ -> (x : Var Γ ρ) -> Nf (Γ-x) ρ -> Sp (Γ-x) σ τ

and application of a normal form to a spine:

_ @ _ : Nf Γ σ -> Sp Γ σ τ -> Nf Γ τ

The definitions are the following ones:

_ [ _ := _ ]

- (λn t)[x := u] = λn t[x+ := u+]
- (ne y ts)[x := u] = ne y’ ts{x := u} where y’+x == y if x ≠ y
- (ne x ts)[x := u] = u @ ts{x := u}

_ { _ := _ }

- ε{x := u} = ε
- (t , ts){x := u} = t[x := u] , ts{x := u}

_ @ _

- u @ ε = u
- (λn u) @ (t , ts) = u[ø := t] @ ts
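The equations above can be transcribed almost directly into Haskell (a sketch of mine using plain de Bruijn indices; the type indices and η-expansion are omitted, so only the β-normalization part is shown):

```haskell
-- β-normal forms with de Bruijn indices: Ne y sp is variable y
-- applied to a spine of normal forms
data Nf = Lam Nf | Ne Int [Nf] deriving (Eq, Show)

-- weakening: shift free variables at or above the cutoff (the "+")
wk :: Int -> Nf -> Nf
wk c (Lam t)   = Lam (wk (c + 1) t)
wk c (Ne y sp) = Ne (if y >= c then y + 1 else y) (map (wk c) sp)

-- t [ x := u ] : substitute and re-normalize in one pass
subst :: Nf -> Int -> Nf -> Nf
subst (Lam t) x u = Lam (subst t (x + 1) (wk 0 u))
subst (Ne y sp) x u
  | y == x    = appSp u (map (\t -> subst t x u) sp)
  | otherwise = Ne (if y > x then y - 1 else y) (map (\t -> subst t x u) sp)

-- u @ sp : apply a normal form to a spine
appSp :: Nf -> [Nf] -> Nf
appSp u []           = u
appSp (Lam t) (s:sp) = appSp (subst t 0 s) sp
appSp (Ne y sp) sp'  = Ne y (sp ++ sp')   -- unreachable on well-typed input
```

For example, applying the K combinator λλ1 to two arguments returns the first, already in normal form.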

This is structurally recursive and terminates: as measures, we consider (type of x, length of the normal form) for _ [ _ := _ ]; (type of x, length of the spine) for _ { _ := _ }; and (type of u) for _ @ _. It is easy to see that when each function calls itself, and when _ [ _ := _ ] and _ { _ := _ } call each other, the measures strictly decrease (for the lexicographic order).

The problem is when _ [ _ := _ ] and _ @ _ call each other. In that case, we have:

- (ne x (t , ts))[x := λn u] = (λn u) @ (t’ , ts’) where (t’ , ts’) == (t , ts){x := λn u}
- (λn u) @ (t’ , ts’) = (u[ø := t']) @ ts’

When (ne x (t , ts))[x := λn u] calls _ @ _, x and (λn u) have type (σ → τ). When (λn u) @ (t’ , ts’) calls _ [ _ := _ ], ø has type σ, and σ < σ → τ; and (u[ø := t']) has type τ, and τ < σ → τ.

Once we have established hereditary substitutions, we can normalize terms:

- nf (var v) = nf’ v ε where:

- nf’ v ts = ne v ts if ts : Sp Γ σ ο
- nf’ v ts = λn nf’ v (ts+,(nf’ ø ε)) if ts : Sp Γ σ (τ → ρ)

- nf (λ t) = λn (nf t)
- nf (app t u) = t’[ø := nf u] where nf t == λn t’

Conversely, we can easily define an embedding function <_> that transforms a normal form back into a term.

During my internship, I intended to prove three properties:

- t ≈ < nf t > (completeness)
- t ≈ u => nf t == nf u (soundness)
- nf <t> == t (stability)

I nearly managed to complete these proofs; nonetheless, one proof is still missing: the fact that η-equality holds for normal forms:

λn (app’ t (nf’ ø ε)) == t where t has an arrow type and app’ (λn t) u = t[ø := u]

Proving this is similar to proving: (t+1)[ø := (nf' ø ε)] == t. Nils previously carried out this proof, but in another framework, and we did not see how to prove this property in the framework exposed above.

Last week we talked about methods for defining productive functions in total languages.

In EWD 792 Dijkstra defines (a slight variant of) the stream of Hamming numbers roughly as follows (here I use Haskell syntax, and `merge` merges two ordered streams into one):

hamming = 1 : merge (map (2 *) hamming) (map (3 *) hamming)

(This definition is attributed to van de Snepscheut.) However, this definition is not guarded by constructors. How, then, can we implement it in a language like Agda, which only allows definitions which are either structurally recursive or guarded by constructors (with some extensions)?
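In Haskell the definition goes through thanks to lazy evaluation; a runnable version (my transcription, with a duplicate-removing `merge`):

```haskell
-- merge two ordered streams into one, removing duplicates
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x : xs) (y : ys)
  | x < y     = x : merge xs (y : ys)
  | x > y     = y : merge (x : xs) ys
  | otherwise = x : merge xs ys

-- the 2-3-smooth numbers, defined by self-reference
hamming :: [Integer]
hamming = 1 : merge (map (2 *) hamming) (map (3 *) hamming)
```

Note that the recursive occurrences of `hamming` sit under `map` and `merge`, not directly under the constructor `(:)`, which is exactly why Agda’s guardedness check rejects the definition as written.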


Paul Levy seems to have taken an interest in the category of containers, and in particular in whether or not it is a CCC. (Like [Set,Set], in which it fully and faithfully embeds.)

In his thesis, Pierre Hyvernat proved that (interaction structures, which are essentially) indexed containers provide a model of multiplicative linear logic, which is to say tensor (><), lolly (-o) and bang (!). A while ago, we "back-ported" at least tensor and lolly to the category of (plain 'ole) containers. I'm not 100% sure, but I think we backported the bang. (The details are in the containers blog, which seems to be defunct.) It would follow that if we define the exponential of containers so that (A -> B) = (!A)-oB, we’d get what ought to be the exponential. Some slightly inebriated investigations last night persuaded me that it could well work out. The question arises of relating this construction to Paul L’s one — it looks prima facie a bit different. Paul’s centres round partial functions. Mine centres round bags.

The discussion was mostly about the details of the bang construction, e.g. the right way to deal with multisets constructively. (Pierre used finitely indexed families modulo permutations of the index set.) Thorsten had a vague memory that decidability of equality was needed somewhere. Indeed, this rings some kind of bell. The moral was that one should try to stay out of the pub long enough to nail this stuff down.