## Hereditary substitutions over simply typed λ-calculus

August 13th, 2008 by Chantal Keller

Last Friday, I talked about what I did during the second part of my internship. I explained an Agda implementation of hereditary substitutions over the simply-typed λ-calculus due to Thorsten Altenkirch, building on ideas from Conor McBride and Andreas Abel.

Hereditary substitutions are substitutions over typed λ-calculus that are structurally recursive. They provide a normalization algorithm whose termination can be proved by a simple lexicographic induction. The main idea of the algorithm is to substitute and re-normalize simultaneously.

We will restrict ourselves to the simply-typed λ-calculus and use de Bruijn indices. We will also use a notation from Conor: if Γ is a context and x a variable in Γ, then Γ-x is the context Γ without the variable x.

Normal forms in the simply-typed λ-calculus are terms that are fully β-reduced and η-expanded. There are two ways of constructing them:

- λn : Nf (Γ,σ) τ -> Nf Γ (σ → τ)
- ne : Var Γ σ -> Sp Γ σ ο -> Nf Γ ο

where spines are lists of normal forms:

- ε : Sp Γ σ σ
- _,_ : Nf Γ σ -> Sp Γ τ ρ -> Sp Γ (σ → τ) ρ

We want to define hereditary substitutions over normal forms:

_ [ _ := _ ] : Nf Γ σ -> (x : Var Γ τ) -> Nf (Γ-x) τ -> Nf (Γ-x) σ

in parallel with substitutions over spines:

_ { _ := _ } : Sp Γ σ τ -> (x : Var Γ ρ) -> Nf (Γ-x) ρ -> Sp (Γ-x) σ τ

and application of a normal form to a spine:

_ @ _ : Nf Γ σ -> Sp Γ σ τ -> Nf Γ τ

The definitions are the following ones:

_ [ _ := _ ]

- (λn t)[x := u] = λn t[x+ := u+]
- (ne y ts)[x := u] = ne y’ (ts{x := u}) if x ≠ y, where y’ is the variable y seen in Γ-x (i.e. y’+x == y)
- (ne x ts)[x := u] = u @ ts{x := u}

_ { _ := _ }

- ε{x := u} = ε
- (t , ts){x := u} = t[x := u] , ts{x := u}

_ @ _

- u @ ε = u
- (λn u) @ (t , ts) = u[ø := t] @ ts
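As a sanity check, these three operations can be sketched in untyped Python over de Bruijn normal forms. The names Lam, Ne, lift, subst and app_spine are mine, and the sketch drops the typing invariants that the Agda development maintains:

```python
from dataclasses import dataclass

@dataclass
class Lam:
    """λn : a normal form under one more binder."""
    body: object

@dataclass
class Ne:
    """ne : a variable (de Bruijn index) applied to a spine of normal forms."""
    var: int
    spine: list

def lift(t, cut=0):
    """Weakening t+ : shift every free index >= cut up by one."""
    if isinstance(t, Lam):
        return Lam(lift(t.body, cut + 1))
    v = t.var + 1 if t.var >= cut else t.var
    return Ne(v, [lift(s, cut) for s in t.spine])

def subst(t, x, u):
    """t[x := u] : hereditary substitution; the result is again a normal form."""
    if isinstance(t, Lam):
        return Lam(subst(t.body, x + 1, lift(u)))   # (λn t)[x := u] = λn t[x+ := u+]
    spine = [subst(s, x, u) for s in t.spine]       # ts{x := u}, pointwise
    if t.var == x:
        return app_spine(u, spine)                  # the hereditary case: u @ ts{x := u}
    y = t.var - 1 if t.var > x else t.var           # strengthen y into Γ-x
    return Ne(y, spine)

def app_spine(u, spine):
    """u @ ts : apply a normal form to a spine, β-reducing one argument at a time."""
    for t in spine:
        u = subst(u.body, 0, t)                     # u is a λn whenever the spine is non-empty
    return u
```

For instance, substituting the identity λn (ne ø ε) for the head of ne ø (ne 1 ε , ε) triggers a β-reduction on the fly: subst(Ne(0, [Ne(1, [])]), 0, Lam(Ne(0, []))) evaluates to Ne(0, []), i.e. ne ø ε in the shrunken context.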

This is structurally recursive and terminates: as measures, we take (type of x, size of the normal form) for _ [ _ := _ ]; (type of x, size of the spine) for _ { _ := _ }; and (type of u) for _ @ _. It is easy to see that when each function calls itself, and when _ [ _ := _ ] and _ { _ := _ } call each other, the measures strictly decrease (for the lexicographic order).

The problem is when _ [ _ := _ ] and _ @ _ call each other. In that case, we have:

- (ne x (t , ts))[x := λn u] = (λn u) @ (t’ , ts’) where (t’ , ts’) == (t , ts){x := λn u}
- (λn u) @ (t’ , ts’) = (u[ø := t']) @ ts’

When (ne x (t , ts))[x := λn u] calls _ @ _, x and (λn u) have type (σ → τ). When (λn u) @ (t’ , ts’) calls _ [ _ := _ ], ø has type σ, and σ < σ → τ; and (u[ø := t']) has type τ, and τ < σ → τ.

Once we have established hereditary substitutions, we can normalize terms:

- nf (var v) = nf’ v ε, where:
  - nf’ v ts = ne v ts if ts : Sp Γ σ ο
  - nf’ v ts = λn (nf’ v+ (ts+ , nf’ ø ε)) if ts : Sp Γ σ (τ → ρ)

- nf (λ t) = λn (nf t)
- nf (app t u) = t’[ø := nf u] where nf t == λn t’
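The η-expanding helper nf’ can be sketched in the same untyped Python style. Here eta(v, ty) plays the role of nf’ v ε, with ty the type remaining after the spine; the names eta, O and Arr are invented for the sketch:

```python
from dataclasses import dataclass

@dataclass
class Lam:
    body: object

@dataclass
class Ne:
    var: int
    spine: list

O = "o"                      # the base type ο

@dataclass
class Arr:
    """Function types σ → τ."""
    dom: object
    cod: object

def lift(t, cut=0):
    """Weakening: shift every free index >= cut up by one."""
    if isinstance(t, Lam):
        return Lam(lift(t.body, cut + 1))
    v = t.var + 1 if t.var >= cut else t.var
    return Ne(v, [lift(s, cut) for s in t.spine])

def eta(v, ty, spine=None):
    """nf' v ts : η-expand variable v, where ty is the type remaining after the spine."""
    spine = spine if spine is not None else []
    if ty == O:
        return Ne(v, spine)                         # base type: nothing left to expand
    # arrow type: λn (nf' v+ (ts+ , nf' ø ε)) — weaken everything under the new binder
    lifted = [lift(s) for s in spine]
    return Lam(eta(v + 1, ty.cod, lifted + [eta(0, ty.dom)]))
```

At type ο → ο, eta(0, Arr(O, O)) yields Lam(Ne(1, [Ne(0, [])])), the η-long form λ (ne 1 (ne ø ε , ε)) of variable ø.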

Conversely, we can easily define an embedding function <_> that transforms a normal form back into a term.

During my internship, I intended to prove three properties:

- t ≈ < nf t > (completeness)
- t ≈ u => nf t == nf u (soundness)
- nf <t> == t (stability)

I nearly managed to complete these proofs; nonetheless, one lemma is missing: the fact that η-equality holds for normal forms:

λn (app’ t (nf’ ø ε)) == t where t has an arrow type and app’ (λn t) u = t[ø := u]

Proving this is similar to proving: (t+1)[ø := (nf' ø ε)] == t. Nils previously carried out this proof, but in another framework, and we did not see how to prove this property in the framework exposed above.

August 13th, 2008 at 6:31 pm

Hi Chantal and Nottingham folks,

I know this theorem is true—in the sense that I proved it at some point, another student Arbob Ahmad proved it at some point, it’s buried in the Twelf code for the Lee-Crary-Harper type safety proof for SML, etc… Unfortunately I don’t think anyone has a nice PDF write-up for you, though.

I think you need to prove the following three lemmas mutually recursively:

(1) M [ x := eta x ] = M

(2) (eta x) @ S = ne x S (where S : Sp Gamma A o is a full spine)

Generalization to get the induction to go through:

(eta x S0) @ S = ne x (append S0 S)

(3) (eta y) [ y := M ] = M

Here I’m writing M for a normal form, S for a spine, and eta for what you called nf’ above, where (eta x) means (eta x epsilon). The proof is by the usual lexicographic induction on a type and a term that makes hereditary substitution tick.

Arbob and I just went through a few cases on the board to remind ourselves of these lemmas, but we didn’t check them all, so we might be missing some of the details.

Feel free to e-mail me if you try this proof and get stuck and want to work through it, or let me know if it works out!

-Dan

August 13th, 2008 at 8:37 pm

OK, I just worked through the details, and I think the proof works.

I needed to generalize (3) to

(3′) (eta y S) [ y := M ] = M @ (S { y := M })

to get the induction to go through in the -> case.

One might also break out

(4) M : A -> B = \z.M@(eta z)

as a separate lemma used to prove (3′), but this is just an easy (non-inductive) wrapper around (1).

As far as termination goes, (1) calls (2), (2) calls (3), and (3) calls back to (1), all with the type getting smaller. And each part is locally inductive on the type/term that gets the functions to fire.

I also needed the following lemmas (proved beforehand; not part of the loop):

[for part 2] (eta y S) [ x := M ] = eta y (S { x := M })

when x =/= y

[for part 3] (M @ S1) @ S2 = M @ (S1 ; S2)

I’ll ask around and see if anyone knows a nicer way to do this—the generalizations in (2) and (3) seem a little convoluted.
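These lemmas can at least be sanity-checked on small instances with an untyped Python sketch of the operations (all names here are invented; this is a machine check of two instances, not a proof):

```python
from dataclasses import dataclass

@dataclass
class Lam:
    body: object

@dataclass
class Ne:
    var: int
    spine: list

O = "o"

@dataclass
class Arr:
    dom: object
    cod: object

def lift(t, cut=0):
    if isinstance(t, Lam):
        return Lam(lift(t.body, cut + 1))
    v = t.var + 1 if t.var >= cut else t.var
    return Ne(v, [lift(s, cut) for s in t.spine])

def subst(t, x, u):
    if isinstance(t, Lam):
        return Lam(subst(t.body, x + 1, lift(u)))
    spine = [subst(s, x, u) for s in t.spine]
    if t.var == x:
        return app_spine(u, spine)
    return Ne(t.var - 1 if t.var > x else t.var, spine)

def app_spine(u, spine):
    for t in spine:
        u = subst(u.body, 0, t)
    return u

def eta(v, ty, spine=None):
    spine = spine if spine is not None else []
    if ty == O:
        return Ne(v, spine)
    return Lam(eta(v + 1, ty.cod, [lift(s) for s in spine] + [eta(0, ty.dom)]))

# lemma (2): (eta x) @ S == ne x S, for a full spine S
S = [Ne(3, [])]
assert app_spine(eta(0, Arr(O, O)), S) == Ne(0, S)

# auxiliary lemma for part 3: (M @ S1) @ S2 == M @ (S1 ; S2)
M = Lam(Lam(Ne(0, [])))                  # λ λ (ne ø ε): returns its second argument
S1, S2 = [Ne(4, [])], [Ne(7, [])]
assert app_spine(app_spine(M, S1), S2) == app_spine(M, S1 + S2)
```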

August 13th, 2008 at 9:54 pm

Hi Chantal,

Dan pointed me to your blog post. I’ve done the proof of the relevant theorem for my work on LFR, an extension of LF with refinement types, in which working with only canonical forms has proven essential and led to many new insights.

I have a complete write-up of the appropriate generalization and the proof in my LFR technical report; see http://www.cs.cmu.edu/~wlovas/papers/lfr-tr.pdf, Lemma 3.21. It follows roughly the same train of thought as Dan’s comments above, except that I was working with “spineless” terms, where the head of an application is not exposed.

Also worth mentioning: Frank Pfenning wrote up a great deal of the work on canonical forms and hereditary substitutions for the simply-typed case in a paper for the Peter Andrews festschrift. The paper is called “Church and Curry: Combining Intrinsic and Extrinsic Typing” and is available at http://www.cs.cmu.edu/~fp/papers/andrews08.pdf .

Hope these pointers are helpful! Best wishes.

William

August 15th, 2008 at 3:00 pm

So, it turns out I was wrong about no one having a nice PDF write-up for you. Not only have William and Frank written out the proof for spineless form, but Jason Reed has a PDF with the proof for spine form (the representation you’re using here):

http://www.cs.cmu.edu/~jcreed/papers/eta-expansion.pdf

It looks like Jason used the same set of lemmas that I came up with above.

Enjoy!

-Dan

August 25th, 2008 at 1:30 pm

Hello Dan and William,

I thank you for your suggestions, but unfortunately I don’t think this will solve my problems.

Indeed, I was not clear about the way I use the minus notation from Conor. It works as follows:

Gamma : Context ; x : Var Gamma sigma
————————————————–
Gamma – x : Context

with the following definition:

- (Gamma,sigma) – 0 = Gamma

- (Gamma,sigma) – (vs x) = (Gamma – x),sigma

It allows us to give substitution its proper type:

t : Nf Gamma sigma ; x : Var Gamma tau ; u : Nf (Gamma – x) tau
——————————————————————————-
t[x := u] : Nf (Gamma – x) sigma

So you must be in a context where x has been removed. To come back to what we want to prove:

(t+1)[0 := eta 0] == t

the two “zeros” do not stand for the same variable: indeed, the second zero must live in a context where the first zero does not exist! If the first zero has type Var (Gamma,sigma,tau) tau, then the second one has type Var (Gamma,sigma) sigma.

As a result, the lemma:

M[x := eta x] = M

won’t solve the problem.

I hope I was clearer; the minus notation sometimes leads to tricky things.

Thank you!

Chantal.

August 26th, 2008 at 2:56 pm

You’re right: I was being a little too slick by writing

M [ x := eta x ] = M

This statement relies on tacitly alpha-converting so that the variable you’re substituting for (the first x) is the same as the variable that you’re substituting in (the second x). But these are definitely two different variables, one a pointer into (Gamma,x) and one a pointer into Gamma.

A more honest way of stating the lemma (which Jason uses in his notes) is

M [ x := eta y ] = [y/x]M

where [y/x]M is a simple variable-for-variable substitution (just go through and replace one variable with the other—no fancy hereditary stuff).

Is that theorem statement easier to translate to your de Bruijn representation?
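In de Bruijn style, the variable-for-variable substitution [y/x]M is a plain index traversal. A rough untyped Python sketch (the names, and the rename helper, are invented for illustration) shows it agreeing with hereditary substitution of an η-expanded variable at base type, where eta y is just ne y ε:

```python
from dataclasses import dataclass

@dataclass
class Lam:
    body: object

@dataclass
class Ne:
    var: int
    spine: list

def lift(t, cut=0):
    if isinstance(t, Lam):
        return Lam(lift(t.body, cut + 1))
    v = t.var + 1 if t.var >= cut else t.var
    return Ne(v, [lift(s, cut) for s in t.spine])

def subst(t, x, u):
    if isinstance(t, Lam):
        return Lam(subst(t.body, x + 1, lift(u)))
    spine = [subst(s, x, u) for s in t.spine]
    if t.var == x:
        return app_spine(u, spine)
    return Ne(t.var - 1 if t.var > x else t.var, spine)

def app_spine(u, spine):
    for t in spine:
        u = subst(u.body, 0, t)
    return u

def rename(t, x, y):
    """[y/x]t : replace index x by index y (an index in Γ-x); no reduction happens."""
    if isinstance(t, Lam):
        return Lam(rename(t.body, x + 1, y + 1))
    v = y if t.var == x else (t.var - 1 if t.var > x else t.var)
    return Ne(v, [rename(s, x, y) for s in t.spine])

# at base type, eta y is just ne y ε, so M[x := eta y] == [y/x]M on this instance:
t = Ne(3, [Ne(0, [])])
assert subst(t, 0, Ne(1, [])) == rename(t, 0, 1) == Ne(2, [Ne(1, [])])
```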

September 15th, 2009 at 1:55 pm

Hi Dan, William and FP-lunch bloggers,

Sorry to answer so late; I did not forget you, but I have only just found the time to work out a solution and to experiment with it.

You were completely right about the way to prove my lemma, and Jason’s proof, for instance, establishes well how to prove the property. What was really difficult was translating it into my syntax:

* using de Bruijn indices instead of named variables;

* not first defining the terms and then the typing rules, but using dependent types to directly represent well-typed terms.

I recall that we wanted to establish the following property:

(t+1)[ø := (nf' ø ε)] == t

where t is a normal form.
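In an untyped Python sketch of the operations (invented names; only a sanity check, since the difficulty below is precisely that the typed statement is harder to phrase), this property can at least be tested on instances at base type:

```python
from dataclasses import dataclass

@dataclass
class Lam:
    body: object

@dataclass
class Ne:
    var: int
    spine: list

def lift(t, cut=0):
    """t+1 : shift every free index >= cut up by one."""
    if isinstance(t, Lam):
        return Lam(lift(t.body, cut + 1))
    v = t.var + 1 if t.var >= cut else t.var
    return Ne(v, [lift(s, cut) for s in t.spine])

def subst(t, x, u):
    if isinstance(t, Lam):
        return Lam(subst(t.body, x + 1, lift(u)))
    spine = [subst(s, x, u) for s in t.spine]
    if t.var == x:
        return app_spine(u, spine)
    return Ne(t.var - 1 if t.var > x else t.var, spine)

def app_spine(u, spine):
    for t in spine:
        u = subst(u.body, 0, t)
    return u

# at base type ο, nf' ø ε is simply ne ø ε, i.e. Ne(0, []):
t = Ne(0, [Ne(1, [])])
assert subst(lift(t), 0, Ne(0, [])) == t

# under a λn the indices shift, which is where the typed statement gets delicate:
t2 = Lam(Ne(1, []))
assert subst(lift(t2), 0, Ne(0, [])) == t2
```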

We cannot directly prove this, because it would be an induction over t, and in the λn case we would have to prove:

(t+2)[1 := (nf' 1 ε)] == t

Our first idea (with Thorsten) was to generalise it as follows:

(t+i+1)[i := (nf' i ε)] == t

where t is a normal form and i is any variable. That would solve the problem in the λn case during the induction, and this is what your comments propose in the case of named variables.

The problem is that with our syntax, this expression simply does not typecheck. Indeed, in this expression i represents a number (a de Bruijn index), and with our directly typed syntax we cannot express indices that do not represent concrete variables.

It took me time to find a solution to this problem without changing our syntax (or using an intermediate syntax that would not be typed). My idea is the following: I generalise the lemma to:

(t+i)[j := (nf' k ε)] == | p > t

where i, j and k are three _different_ variables, plus some hypotheses about i, j and k that reflect the fact that “i-1”, j and k are the same de Bruijn indices (but not the same variables, because they do not live in the same contexts). These hypotheses are tricky to find. | p > t is just a way to change the context in which t is typed: if t is typed in Γ1 and p is a proof that Γ1 == Γ2, then | p > t is the same as t, but in context Γ2.

This expression typechecks, and is provable in both the λn and ne cases (though not easily in the latter).

I hope I was clear, let me know what you think of that!

The source code is available here:

.

But it is not easy to read, even though I tried to add comments about what the lemmas prove and to group similar lemmas together. The definitions of hereditary substitutions are in the file [hsubst.agda] and the main lemmas in the file [proofs.agda].

Chantal.

September 15th, 2009 at 1:59 pm

The place where one can find the source code does not appear. Here it is:

http://perso.ens-lyon.fr/chantal.keller/Documents-recherche/Stage08/Hereditary-substitution/hereditary-substitutions.tar.gz