In today’s FP Lunch I mentioned a problem with Coq’s guardedness checker.
Consider the following simple representation of languages (using Agda-like syntax):
codata L : Set where ε : L sym : Char -> L _·_ : L -> L -> L _∣_ : L -> L -> L
ε is the language consisting of the empty string,
sym c is the language containing only the character
_·_ encodes sequencing, and
_∣_ takes the union of two languages. I want to be able to represent languages with strings of arbitrary (but finite) length, so
L is a coinductive definition.
As an example of languages containing arbitrary-length strings, consider the following definitions:
mutual _⋆ : L -> L p ⋆ = ε ∣ p + _+ : L -> L p + = p · p ⋆
p ⋆ consists of strings made up of zero or more occurrences of
p + contains strings with one or more occurrences of
p. These definitions are well-founded, or productive, since they are (mutually) guarded; the right-hand sides consist of applications of one or more constructors to corecursive occurrences of the defined languages (and other, non-recursive expressions).
We can define more languages, like
p sepBy sep, which consists of one or more
ps separated by
_sepBy_ : L -> L -> L p sepBy sep = p · (sep · p) ⋆
This definition is productive since it is not (co)recursive.
Using the definitions above we can now define a simple expression language (here I have omitted some auxiliary definitions):
mutual expr : L expr = term sepBy addOp term : L term = factor sepBy mulOp factor : L factor = number ∣ sym '(' · expr · sym ')'
However, now we have a problem, because the definition above is not guarded;
sepBy is not a constructor. The previous definitions above are all accepted by Coq, which has support for coinductive types and guarded corecursive definitions, but
factor are not accepted.
The definitions are still productive, though. One way to see this is to inline the definitions (not simply the right-hand sides) of
_+ into the mutual block:
mutual expr : L expr = term · expr⋆ expr⋆ : L expr⋆ = ε ∣ expr+ expr+ : L expr+ = (addOp · term) · expr⋆ term : L term = factor · term⋆ term⋆ : L term⋆ = ε ∣ term+ term+ : L term+ = (mulOp · factor) · term⋆ factor : L factor = number ∣ sym '(' · expr · sym ')'
This shows (in a hand-wavy manner) that one can turn some non-guarded definitions into guarded ones. The definition above is not something which I actually want to use in practice, though.
Another approach would be to liberate the guardedness condition a little, stating that a corecursive occurrence is guarded if it occurs under one or more constructors or producers; a function is a producer for a given argument if all occurrences of this argument on the right-hand side are guarded (according to this new definition). Note that
sepBy are all producers (for all arguments), so with this definition of guardedness the definition of
expr would be accepted.
I like this definition since it is simple to state and understand (and hopefully correct…), and covers a useful class of definitions. However, I am a little wary of extending languages in ad-hoc ways. Which reasonable definitions are not covered by this extension? Are there other extensions which cover more cases but are still simple to understand? Maybe it is better to use a language where productivity is a first-class concept, so that the “productivity checker” can, in effect, be extended inside the language.