The last wave of reviewing has just subsided, and I set myself a task of understanding and reproducing Allais’s Agda strong normalisation proof included as part of the ‘POPLmark reloaded’ project. The proof is built upon Allais et al’s ‘generic syntax’ library, whose core is a traversal operator defined datatype-generically for syntax trees with the standard λ-style binders and variables as de Bruijn indices. My first step is to study an earlier, not yet datatype-generic version of the traversal operator reported in their CPP ’17 paper. Even simpler than what’s presented in the paper, I’ll just experiment with the most simply typed λ-calculus:
data _∋_ {A : Set} : List A → A → Set where
zero : {x : A} {xs : List A} → x ∷ xs ∋ x
suc : {x y : A} {xs : List A} → xs ∋ x → y ∷ xs ∋ x
data Ty : Set where
ι : Ty
_⇒_ : Ty → Ty → Ty
Cxt : Set
Cxt = List Ty
data _⊢_ : Cxt → Ty → Set where
`_ : Γ ∋ τ
→ -----
Γ ⊢ τ
_·_ : Γ ⊢ σ ⇒ τ
→ Γ ⊢ σ
→ ---------
Γ ⊢ τ
ƛ_ : σ ∷ Γ ⊢ τ
→ -------------
Γ ⊢ σ ⇒ τ
The traversal operator was generalised from renaming and substitution, which we should take a close look at before attempting the generalisation. First we look at renaming:
ren : Γ ⊢ τ → (∀ {σ} → Γ ∋ σ → Δ ∋ σ) → Δ ⊢ τ
ren (` i ) ρ = ` ρ i
ren (t · u) ρ = ren t ρ · ren u ρ
ren (ƛ t ) ρ = ƛ ren t λ { zero → zero; (suc i) → suc (ρ i) }
The function ren
first takes a term of type τ
whose free variables are drawn from the context Γ
.
Its second argument is an environment that maps every such free variable (of type Γ ∋ σ
) to another free variable (of type Δ ∋ σ
) drawn from a possibly different context Δ
.
The function then replaces every Γ
-variable in the input term by the corresponding Δ
-variable, thereby producing a term also of type τ
but now under the context Δ
.
The ƛ
case deserves particular attention:
It goes under the binder and processes the sub-term t : σ ∷ Γ ⊢ τ
recursively, whose context is extended by one variable of type σ
, and the resulting sub-term is also under an extended context σ ∷ Δ
.
The second argument of the recursive call thus needs to be an extended environment where the new variable zero : σ ∷ Γ ∋ σ
is mapped to ‘itself’ zero : σ ∷ Δ ∋ σ
(because the new variable doesn’t need to be renamed), while the rest of the variables (of the form suc i
) are still mapped to the corresponding variables ρ i
according to the input environment ρ
, except that ρ i
— which is a de Bruijn index — should be incremented (because it is outside the newly introduced binder) and become a variable under an extended environment σ ∷ Δ
rather than Δ
, as required by the target type σ ∷ Δ ⊢ τ
of the recursive call.
Substitution follows a similar pattern:
sub : Γ ⊢ τ → (∀ {σ} → Γ ∋ σ → Δ ⊢ σ) → Δ ⊢ τ
sub (` i ) ρ = ρ i
sub (t · u) ρ = sub t ρ · sub u ρ
sub (ƛ t ) ρ = ƛ sub t λ { zero → ` zero; (suc i) → ren (ρ i) suc }
The environment now contains terms rather than variables, although the terms are still indexed by Δ
and σ
— the contents of the environment will later be generalised to arbitrary values whose types are indexed by Δ
and σ
.
The most interesting case is still the ƛ
case, which proceeds in roughly the same way, except that the value for the new variable zero
in the extended environment is now the variable term ` zero
, and the way of updating a value/term of type Δ ⊢ τ
to one of type σ ∷ Δ ⊢ τ
(with an extended environment) is incrementing all the free variables in the term by λ t → ren t suc
.
We can now generalise the pattern to a traversal operator
sem : (S : Semantics) → Γ ⊢ τ → Env Γ (S .Val) Δ → S .Com Δ τ
which translates a term of type Γ ⊢ τ
to a function from an environment for Γ
containing values of type Val
to something of type Com
(module the indices).
More precisely, Val
and Com
are type families indexed by Cxt
and Ty
, for which we define a type:
Model : Set₁
Model = Cxt → Ty → Set
Env
is just a record type wrapping the function type for environments to aid type inference in Agda:
record Env (Γ : Cxt) (M : Model) (Δ : Cxt) : Set where
constructor env
field
lookup : Γ ∋ τ → M Δ τ
Val
and Com
are part of a record S
of type Semantics
, whose other fields prove that Val
and Com
satisfy some necessary requirements for the traversal to work.
The record type is defined as
record Semantics : Set₁ where
field
Val : Model
thV : Val Δ τ → Val (σ ∷ Δ) τ
Com : Model
var : Val Δ τ → Com Δ τ
app : Com Δ (σ ⇒ τ) → Com Δ σ → Com Δ τ
lam : (Val (σ ∷ Δ) σ → Com (σ ∷ Δ) τ) → Com Δ (σ ⇒ τ)
On Val
we require that the type can be ‘thinned’ (by thV
), that is, the context in the type can be extended; this is used when we go under a binder and need to adapt the environment.
Note that thinning isn’t needed for Com
, and this offers one justification for the separation of Val
and Com
: if we also stored Com
in the environment, then we’d have to do thinning on Com
, which may be harder.
(For example, if we store terms rather than variables in the environment during renaming, then we’d have to do thinning on terms, which would be renaming itself.)
Then for each term constructor we need a corresponding case specifying how to compute the semantics inductively.
The var
case takes the result of looking up a variable in the environment, which has type Val Δ τ
, and should specify how to turn it into a Com Δ τ
.
The corresponding case of sem
is
sem S (` i) ρ = S .var (lookup ρ i)
The app
case assumes that the semantics (of types Com Δ (σ ⇒ τ)
and Com Δ σ
) have been computed inductively from the two sub-terms of an application, and should specify how to compute the semantics of the whole term (of type Com Δ τ
).
The corresponding case of sem
is
sem S (t · u) ρ = S .app (sem S t ρ) (sem S u ρ)
The lam
case works on an abstraction and basically assumes that the semantics of the body (of type Com (σ ∷ Δ) τ
) has been inductively computed; this semantics in fact depends on the Val (σ ∷ Δ) σ
to which the newly introduced variable is mapped by the extended environment, so what’s inductively computed from the body is a function from Val (σ ∷ Δ) σ
to Com (σ ∷ Δ) τ
.
From this function we should then be able to compute the semantics of the abstraction (of type Com Δ (σ ⇒ τ)
).
The corresponding case of sem
applies lam
to the function that computes the semantics of the body inductively with the environment extended with the function’s input after its original contents are thinned,
sem S (ƛ t) ρ = S .lam λ v → sem S t (v ▸ modify (S .thV) ρ)
where environment thinning is done with the helper function
modify : {M M' : Model} → (∀ {τ} → M Δ τ → M' Δ' τ) → Env Γ M Δ → Env Γ M' Δ'
lookup (modify f ρ) i = f (lookup ρ i)
and environment extension is defined by
_▸_ : {M : Model} → M Δ σ → Env Γ M Δ → Env (σ ∷ Γ) M Δ
lookup (m ▸ ρ) zero = m
lookup (m ▸ ρ) (suc x) = lookup ρ x
Renaming and substitution are now easily implemented as instances of Semantics
:
ren : Semantics
ren .Val Δ τ = Δ ∋ τ
ren .thV = suc
ren .Com Δ τ = Δ ⊢ τ
ren .var i = ` i
ren .app t u = t · u
ren .lam f = ƛ f zero
sub : Semantics
sub .Val Δ τ = Δ ⊢ τ
sub .thV t = sem ren t (env suc)
sub .Com Δ τ = Δ ⊢ τ
sub .var i = i
sub .app t u = t · u
sub .lam f = ƛ f (` zero)
Notably, environment management is entirely hidden from the user of the traversal operator (i.e., the provider of a Semantics
), which I believe is the essence of the operator.
For a differently flavoured but probably also illuminating example, we can implement a rudimentary term printer which interprets terms as strings in a state monad where the state is a string of single-character fresh names; the environment is used to store strings to which variables are printed.
Notably, the lam
case is quite natural: we extract a fresh varName
from the state and apply the inductively assumed function f
to varName
to get the string for the body, in which the occurrences of the new variable are printed to varName
.
printer : Semantics
printer .Val Δ τ = Char
printer .thV c = c
printer .Com Δ τ = State String String
printer .var c = return (fromChar c)
printer .app ms mt = do
s ← ms
t ← mt
return ("(" ++ s ++ t ++ ")")
printer .lam f = do
fresh ← get
let (varName , fresh') = split fresh
put fresh'
body ← f varName
put fresh
return ("(λ" ++ fromChar varName ++ "." ++ body ++ ")")
where
split : String → Char × String
split s with toList s
split s | [] = ' ' , ""
split s | x ∷ xs = x , fromList xs
As a test case, the term
testTerm : [] ⊢ ((ι ⇒ ι) ⇒ (ι ⇒ ι) ⇒ ι ⇒ ι) ⇒ ι ⇒ ι
testTerm = ƛ (ƛ ` suc zero · ` zero · ` zero) · (ƛ ` zero)
can be printed as
proj₁ (sem {Δ = []} printer testTerm (env λ ()) "abcdefg")
= "(λa.((λb.((ab)b))(λb.b)))"
(with excessive parentheses which I don’t bother to optimise away).
Remark. In the CPP ’17 paper, the antecedent of lam
(and similarly for thV
) is a stronger Kripke-style function:
lam : (∀ Δ' → Δ ⊆ Δ' → Val Δ' σ → Com Δ' τ) → Com Δ (σ ⇒ τ)
I suspect that this is needed only in their example on normalisation by evaluation, which I didn’t try, although I imagine that being able to inductively assume a Kripke-style function in the lam
case is another selling point of (the full version of) the traversal operator. (End of remark.)
The CPP ’17 paper culminates with some theorems about the traversal operator, one of which is simulation of semantics, which abstracts logical relation proofs. As a typical example, I decide to try and prove the free theorems for terms interpreted as polymorphic functions.
This Semantics
is fairly standard.
At type level, we can interpret a Ty
as an Agda function type where all the occurrences of ι
become a given Agda type X : Set
:
PolyTy : Ty → Set → Set
PolyTy ι X = X
PolyTy (σ ⇒ τ) X = PolyTy σ X → PolyTy τ X
This is then extended to contexts and judgements:
PolyCxt : Cxt → Set → Set
PolyCxt [] X = ⊤
PolyCxt (σ ∷ Γ) X = PolyCxt Γ X × PolyTy σ X
PolyJdg : Cxt → Ty → Set → Set
PolyJdg Γ τ X = PolyCxt Γ X → PolyTy τ X
By using PolyJdg
as the models, we can define a functional Semantics
to interpret terms as functions.
In particular, variables are interpreted as projections, as we’ll see in the lam
and thV
cases below: initially a newly introduced variable of type σ ∷ Δ ∋ σ
is assigned the second projection proj₂ : PolyCxt Δ X × PolyTy σ X → PolyTy σ X
in the environment; as the environment is pushed under binders, the projections are thinned by pre-composing proj₁
.
The rest of the Semantics
is straightforward.
fun : Set → Semantics
fun X .Val Δ τ = PolyJdg Δ τ X
fun X .thV f = f ∘ proj₁
fun X .Com Δ τ = PolyJdg Δ τ X
fun X .var = id
fun X .app mf mx r = (mf r) (mx r)
fun X .lam f = curry (f proj₂)
Given a function f : X → Y
, we say that two values x : X
and y : Y
are related exactly when f x ≡ y
, and that two functions are related exactly when they map related inputs to related outputs.
We can formalise this definition of relatedness for Ty
, contexts, and judgements:
PolyTyR : {X Y : Set} (f : X → Y) → PolyTy τ X → PolyTy τ Y → Set
PolyTyR {τ = ι } f x y = f x ≡ y
PolyTyR {τ = σ ⇒ τ} {X = X} {Y = Y} f g h =
{x : PolyTy σ X} {y : PolyTy σ Y} → PolyTyR f x y → PolyTyR f (g x) (h y)
PolyCxtR : {X Y : Set} (f : X → Y) → PolyCxt Γ X → PolyCxt Γ Y → Set
PolyCxtR {Γ = [] } f tt tt = ⊤
PolyCxtR {Γ = σ ∷ Γ} f (p , q) (p' , q') = PolyCxtR f p p' × PolyTyR f q q'
PolyJdgR : {X Y : Set} (f : X → Y) → PolyJdg Γ τ X → PolyJdg Γ τ Y → Set
PolyJdgR {Γ = Γ} {X = X} {Y = Y} f g h =
{x : PolyCxt Γ X} {y : PolyCxt Γ Y} → PolyCxtR f x y
→ PolyTyR f (g x) (h y)
A free theorem then states that the functions sem (fun X) t ρX
and sem (fun Y) t ρY
are related (satisfy the relation PolyJdgR f
) whenever ρX
and ρY
are pointwise related.
The following is a concrete example of a free theorem,
(t : [] ⊢ ι ⇒ ι) (X Y : Set) (f : X → Y)
→ let g = sem (fun X) t (env λ ()) tt
h = sem (fun Y) t (env λ ()) tt
in (x : X) → f (g x) ≡ h (f x)
from which one can further deduce that the functional semantics of any t : [] ⊢ ι ⇒ ι
can only be the identity function.
It is possible to establish a general theorem stating that if two semantics are in simulation — that is, all the cases preserve some specified relations on the models — then the traversals of a term using the two semantics are related whenever the environments are pointwise related. Free theorems then become a special case. The simulation theorem is pasted below but only for archival purposes — I don’t intend to go into the details:
record Sim (S S' : Semantics) : Set₁ where
field
RVal : S .Val Γ τ → S' .Val Γ τ → Set
thRV : {v : S .Val Γ τ} {v' : S' .Val Γ τ}
→ RVal v v' → RVal {Γ = σ ∷ Γ} (S .thV v) (S' .thV v')
RCom : S .Com Γ τ → S' .Com Γ τ → Set
var : {v : S .Val Γ τ} {v' : S' .Val Γ τ}
→ RVal v v' → RCom (S .var v) (S' .var v')
app : {c : S .Com Γ (σ ⇒ τ)} {d : S .Com Γ σ}
{c' : S' .Com Γ (σ ⇒ τ)} {d' : S' .Com Γ σ}
→ RCom c c' → RCom d d' → RCom (S .app c d) (S' .app c' d')
lam : {f : S .Val (σ ∷ Γ) σ → S .Com (σ ∷ Γ) τ}
{f' : S' .Val (σ ∷ Γ) σ → S' .Com (σ ∷ Γ) τ}
→ ((v : S .Val (σ ∷ Γ) σ) (v' : S' .Val (σ ∷ Γ) σ)
→ RVal v v' → RCom (f v) (f' v'))
→ RCom (S .lam f) (S' .lam f')
record REnv {M M' : Model} (R : ∀ {Δ τ} → M Δ τ → M' Δ τ → Set)
{Γ Δ : Cxt} (ρ : Env Γ M Δ) (ρ' : Env Γ M' Δ) : Set where
constructor renv
field
lookup : (x : Γ ∋ τ) → R (lookup ρ x) (lookup ρ' x)
_▸ᴿ_ : {M M' : Model} {R : ∀ {Δ τ} → M Δ τ → M' Δ τ → Set}
→ {v : M Δ τ} {v' : M' Δ τ} → R v v'
→ {ρ : Env Γ M Δ} {ρ' : Env Γ M' Δ}
→ REnv R ρ ρ' → REnv R (v ▸ ρ) (v' ▸ ρ')
lookup (vr ▸ᴿ ρr) zero = vr
lookup (vr ▸ᴿ ρr) (suc i) = lookup ρr i
sim : (S S' : Semantics) (R : Sim S S') → (t : Γ ⊢ τ)
→ (ρ : Env Γ (S .Val) Δ) (ρ' : Env Γ (S' .Val) Δ)
→ REnv (R .RVal) ρ ρ' → R .RCom (sem S t ρ) (sem S' t ρ')
sim S S' R (` i ) ρ ρ' ρr = R .var (lookup ρr i)
sim S S' R (t · u) ρ ρ' ρr = R .app (sim S S' R t ρ ρ' ρr)
(sim S S' R u ρ ρ' ρr)
sim S S' R (ƛ t ) ρ ρ' ρr =
R .lam λ v v' vr → sim S S' R t (v ▸ env (S .thV ∘ lookup ρ ))
-- modify (S .thV) ρ
(v' ▸ env (S' .thV ∘ lookup ρ'))
(vr ▸ᴿ renv (R .thRV ∘ lookup ρr))
The only observation I intend to make is that sim
is computationally the same as sem
if we see ρ
, ρ'
, and ρr
as a single environment storing a pair of related semantics for every variable.
The type of sem
is too weak to subsume sim
though: right now sem
is more like a fold operator, but what we need here is more like an induction operator, whose result type can depend on the input term and environment.
Here is my attempt at formulating the induction operator, but again I won’t go into the details, for the reasons that I’ll give at the end of this post:
record Semantics' : Set₁ where
field
VM : Model
thV : VM Δ τ → VM (σ ∷ Δ) τ
TM : Γ ⊢ τ → Env Γ VM Δ → Set
var : (i : Γ ∋ τ) (ρ : Env Γ VM Δ) → TM (` i) ρ
app : {t : Γ ⊢ σ ⇒ τ} {u : Γ ⊢ σ} {ρ : Env Γ VM Δ}
→ TM t ρ → TM u ρ → TM (t · u) ρ
lam : {t : σ ∷ Γ ⊢ τ} {ρ : Env Γ VM Δ}
→ ((v : VM (σ ∷ Δ) σ) → TM t (v ▸ modify thV ρ)) → TM (ƛ t) ρ
sem' : (S : Semantics') (t : Γ ⊢ τ) (ρ : Env Γ (S .VM) Δ) → S .TM t ρ
sem' S (` i ) ρ = S .var i ρ
sem' S (t · u) ρ = S .app (sem' S t ρ) (sem' S u ρ)
sem' S (ƛ t ) ρ = S .lam λ v → sem' S t (v ▸ modify (S .thV) ρ)
It is possible to write everything we’ve seen so far with this induction operator, in particular the simulation theorem, although some non-trivial type transportation starts to show up, which involves function extensionality because it’s about equating different ways of extending environments, which are functional by nature (but fortunately we now have Cubical Agda, where function extensionality is naturally provable).
It quickly becomes obvious that this operator isn’t general enough though.
The CPP ’17 paper goes on to discuss fusion of semantics, so that for example we can prove that two renamings can be fused into one, which can be proved with sem'
as follows (again the details aren’t important, and in fact the lam
case isn’t completely shown):
compose : {M : Model} → Env Γ _∋_ Δ → Env Δ M Θ → Env Γ M Θ
lookup (compose ρ ρ') i = lookup ρ' (lookup ρ i)
renren : Semantics'
renren .VM Δ τ = Δ ∋ τ
renren .thV = suc
renren .TM {Δ = Δ} t ρ =
{Θ : Cxt} (ρ' : Env Δ (ren .VM) Θ)
→ sem' ren (sem' ren t ρ) ρ' ≡ sem' ren t (compose ρ ρ')
renren .var i ρ ρ' = refl
renren .app mteq mueq ρ' = cong₂ _·_ (mteq ρ') (mueq ρ')
renren .lam {t = t} f ρ' =
cong ƛ_ (f zero (zero ▸ modify suc ρ') ∙ cong (sem' ren t) ?)
Note that, in the motive TM
, besides the environment ρ
provided by the induction operator, we also need another environment ρ'
for the second renaming, and have to manually introduce a universal quantification.
Subsequently we have to manage ρ'
manually, notably in the lam
case, where we even have to thin and extend ρ'
ourselves, which should have been taken care of by the operator.
Maybe the induction operator can be further generalised to manage multiple environments, but it starts to look ad hoc.
Therefore, even though my first instinct as a dependently typed programmer is to construct an induction operator so as to deal with both computation and proof in one go, it’s in fact not that obvious how to promote traversal to a natural and sufficiently powerful form of induction — no wonder that the paper merely presented the traversal operator and proved its properties separately.
Agda file: Kit.agda