An ad-hoc approach to productive definitions

August 1st, 2008 by Nils Anders Danielsson

Last week we talked about methods for defining productive functions in total languages.

In EWD 792 Dijkstra defines (a slight variant of) the stream of Hamming numbers roughly as follows (here I use Haskell syntax, and merge merges two ordered streams into one):

hamming = 1 : merge (map (2 *) hamming) (map (3 *) hamming)

(This definition is attributed to van de Snepscheut.) However, this definition is not guarded by constructors. How, then, can we implement it in a language like Agda, which only allows definitions which are either structurally recursive or guarded by constructors (with some extensions)?

One option is to follow something like Gianantonio and Miculan’s A Unifying Approach to Recursive and Co-recursive Definitions. That approach is rather heavy-weight, though. In many cases a simpler but more ad-hoc approach is possible: Make the definition constructor-guarded by turning the offending functions into constructors of some suitable type.

In this case one suitable type is StreamProg, the type of stream programs:

mutual
  codata Stream′ A : Set1 where
    _≺_ : (x : A) (xs : StreamProg A) -> Stream′ A

  data StreamProg (A : Set) : Set1 where
    ↓_      : (xs : Stream′ A) -> StreamProg A
    map     : forall {B}
              (f : B -> A) (xs : StreamProg B) -> StreamProg A
    merge   : (cmp : A -> A -> Ordering)
              (xs : StreamProg A) (ys : StreamProg A) ->
              StreamProg A

Note that StreamProg is inductive, whereas Stream′ is coinductive. Using this type we can define the Hamming stream program (for some suitable comparison function cmp; ↓_ has low precedence):

hamming : StreamProg ℕ
hamming ~ ↓ 1 ≺ merge cmp (map (_*_ 2) hamming)
                          (map (_*_ 3) hamming)

This is a coinductive definition of an inductive value. Agda allows this definition because the corecursive calls are guarded by constructors, at least one of which belongs to a coinductive type.

Now, this is not useful unless we can turn stream programs into actual streams:

codata Stream A : Set where
  _≺_ : (x : A) (xs : Stream A) -> Stream A

But we can. The inductive nature of StreamProg ensures that we can always extract the head of the stream in a finite number of steps, along with a stream program corresponding to the tail:

P⇒′ : forall {A} -> StreamProg A -> Stream′ A
P⇒′ (↓ xs)            = xs
P⇒′ (map f xs)        with P⇒′ xs
P⇒′ (map f xs)        | x ≺ xs′ = f x ≺ map f xs′
P⇒′ (merge cmp xs ys) with P⇒′ xs | P⇒′ ys
P⇒′ (merge cmp xs ys) | x ≺ xs′ | y ≺ ys′ with cmp x y
P⇒′ (merge cmp xs ys) | x ≺ xs′ | y ≺ ys′ | lt = x ≺ merge cmp xs′ ys
P⇒′ (merge cmp xs ys) | x ≺ xs′ | y ≺ ys′ | eq = x ≺ merge cmp xs′ ys′
P⇒′ (merge cmp xs ys) | x ≺ xs′ | y ≺ ys′ | gt = y ≺ merge cmp xs ys′

Finally we can construct the intended stream by iterating this process:

′⇒ : forall {A} -> Stream′ A -> Stream A
′⇒ (x ≺ xs) ~ x ≺ ′⇒ (P⇒′ xs)

P⇒ : forall {A} -> StreamProg A -> Stream A
P⇒ xs = ′⇒ (P⇒′ xs)

Note that ′⇒ is guarded.

There is nothing above which stops us from adding the code of some ill-behaved function to StreamProg, for instance the tail function, or the following variant of map which looks at two elements at a time:

map₂ : forall {A B} -> (A -> B) -> Stream A -> Stream B
map₂ f (x ≺ y ≺ xs) ~ f x ≺ f y ≺ map₂ f xs

However, we would not be able to extend P⇒′. Consider an attempt at adding support for tail, for instance:

P⇒′ (tail xs) with P⇒′ xs
P⇒′ (tail xs) | x ≺ xs′ = P⇒′ xs′

The termination checker does not accept the second recursive call, since it is not performed at a structurally smaller argument. An attempt at handling map₂ would run into similar problems. One can observe that this approach turns problems with productivity into problems with termination.

The example above raises a problem; there are perfectly reasonable productive definitions involving tail or map₂, for instance the Fibonacci sequence:

fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

Can these be handled? The answer is that they can, at the cost of some added complexity. One can index Stream′ and StreamProg on the number of elements that can safely be inspected, and let Stream′ contain a vector of the corresponding length instead of just a single element. In the case of the Fibonacci sequence there is an easier solution, though:

fibs : StreamProg ℕ
fibs ~ ↓ 0 ≺ zipWith _+_ fibs (↓ 1 ≺ fibs)

The approach above is inspired by Hancock and Setzer’s method for modelling possibly infinitely running programs in a total language (see Interactive programs in dependent type theory). At the meeting Conor McBride showed us his version of their approach, which went roughly along the following lines. First a type which indicates whether or not we are in a guarded context:

data Guardedness : Set where
  guarded   : Guardedness
  unguarded : Guardedness

Then an inductive representation of right-hand sides, indexed on the kind of context they are supposed to appear in (guarded or unguarded):

data RHS (A S : Set) : Guardedness -> Set where
  _≺_  : forall {g} (x : A) (xs : RHS A S guarded) -> RHS A S g
  call : (x : S) -> RHS A S guarded

The xs argument to _≺_ is guarded, hence the index guarded; the result can appear in any context. Recursive calls (which take a Seed as their only argument) are only allowed in a guarded context. A generalised unfold function can then be defined as follows:

unfold : forall {A S} ->
         (S -> RHS A S unguarded) -> (S -> Stream A)
unfold {A} {S} step s ~ go (step s)
  where
  ready : RHS A S guarded -> RHS A S unguarded
  ready (x ≺ xs) ~ x ≺ xs
  ready (call x) ~ step x

  go : RHS A S unguarded -> Stream A
  go (x ≺ xs) ~ x ≺ go (ready xs)

This approach to productive definitions is less ad-hoc than the one above, and reasonably light-weight. However, it cannot easily handle functions like map (you cannot simply map a function over a right-hand side if you want the function to be applied to all elements of the resulting stream). I have also had trouble using the approach with indexed types. The ad-hoc approach above, while ad-hoc and monolithic, can easily handle indexed types.

A better solution in the long run may still be to change the language to include some more semantic termination/productivity criterion, like sized types.

Update (2010-01-27): There is now a paper which describes this technique.

Leave a Reply