The trek goes on


Environmentally friendly syntax tree traversal

Posted at 20:30 on 10 August 2021, and revised at 16:23 on 11 October 2021

The last wave of reviewing has just subsided, and I set myself a task of understanding and reproducing Allais’s Agda strong normalisation proof included as part of the ‘POPLmark reloaded’ project. The proof is built upon Allais et al’s ‘generic syntax’ library, whose core is a traversal operator defined datatype-generically for syntax trees with the standard λ-style binders and variables as de Bruijn indices. My first step is to study an earlier, not yet datatype-generic version of the traversal operator reported in their CPP ’17 paper. Even simpler than what’s presented in the paper, I’ll just experiment with the most simply typed λ-calculus:

data _∋_ {A : Set} : List A → A → Set where
  zero : {x   : A} {xs : List A} →          x ∷ xs ∋ x
  suc  : {x y : A} {xs : List A} → xs ∋ x → y ∷ xs ∋ x

data Ty : Set where
  ι   : Ty
  _⇒_ : Ty → Ty → Ty

Cxt : Set
Cxt = List Ty

data _⊢_ : Cxt → Ty → Set where

  `_  : Γ ∋ τ
      → -----
        Γ ⊢ τ

  _·_ : Γ ⊢ σ ⇒ τ
      → Γ ⊢ σ
      → ---------
        Γ ⊢     τ

  ƛ_  : σ ∷ Γ ⊢ τ
      → -------------
            Γ ⊢ σ ⇒ τ

The traversal operator was generalised from renaming and substitution, which we should take a close look at before attempting the generalisation. First we look at renaming:

ren : Γ ⊢ τ → (∀ {σ} → Γ ∋ σ → Δ ∋ σ) → Δ ⊢ τ
ren (` i  ) ρ = ` ρ i
ren (t · u) ρ = ren t ρ · ren u ρ
ren (ƛ t  ) ρ = ƛ ren t λ { zero → zero; (suc i) → suc (ρ i) }

The function ren first takes a term of type τ whose free variables are drawn from the context Γ. Its second argument is an environment that maps every such free variable (of type Γ ∋ σ) to another free variable (of type Δ ∋ σ) drawn from a possibly different context Δ. The function then replaces every Γ-variable in the input term by the corresponding Δ-variable, thereby producing a term also of type τ but now under the context Δ. The ƛ case deserves particular attention: It goes under the binder and processes the sub-term t : σ ∷ Γ ⊢ τ recursively, whose context is extended by one variable of type σ, and the resulting sub-term is also under an extended context σ ∷ Δ. The second argument of the recursive call thus needs to be an extended environment where the new variable zero : σ ∷ Γ ∋ σ is mapped to ‘itself’ zero : σ ∷ Δ ∋ σ (because the new variable doesn’t need to be renamed), while the rest of the variables (of the form suc i) are still mapped to the corresponding variables ρ i according to the input environment ρ, except that ρ i — which is a de Bruijn index — should be incremented (because it is outside the newly introduced binder) and become a variable under an extended environment σ ∷ Δ rather than Δ, as required by the target type σ ∷ Δ ⊢ τ of the recursive call.

Substitution follows a similar pattern:

sub : Γ ⊢ τ → (∀ {σ} → Γ ∋ σ → Δ ⊢ σ) → Δ ⊢ τ
sub (` i  ) ρ = ρ i
sub (t · u) ρ = sub t ρ · sub u ρ
sub (ƛ t  ) ρ = ƛ sub t λ { zero → ` zero; (suc i) → ren (ρ i) suc }

The environment now contains terms rather than variables, although the terms are still indexed by Δ and σ — the contents of the environment will later be generalised to arbitrary values whose types are indexed by Δ and σ. The most interesting case is still the ƛ case, which proceeds in roughly the same way, except that the value for the new variable zero in the extended environment is now the variable term ` zero, and the way of updating a value/term of type Δ ⊢ τ to one of type σ ∷ Δ ⊢ τ (with an extended environment) is incrementing all the free variables in the term by λ t → ren t suc.

We can now generalise the pattern to a traversal operator

sem : (S : Semantics) → Γ ⊢ τ → Env Γ (S .Val) Δ → S .Com Δ τ

which translates a term of type Γ ⊢ τ to a function from an environment for Γ containing values of type Val to something of type Com (module the indices). More precisely, Val and Com are type families indexed by Cxt and Ty, for which we define a type:

Model : Set₁
Model = Cxt → Ty → Set

Env is just a record type wrapping the function type for environments to aid type inference in Agda:

record Env (Γ : Cxt) (M : Model) (Δ : Cxt) : Set where
  constructor env
    lookup : Γ ∋ τ → M Δ τ

Val and Com are part of a record S of type Semantics, whose other fields prove that Val and Com satisfy some necessary requirements for the traversal to work. The record type is defined as

record Semantics : Set₁ where
    Val : Model
    thV : Val Δ τ → Val (σ ∷ Δ) τ
    Com : Model
    var : Val Δ τ → Com Δ τ
    app : Com Δ (σ ⇒ τ) → Com Δ σ → Com Δ τ
    lam : (Val (σ ∷ Δ) σ → Com (σ ∷ Δ) τ) → Com Δ (σ ⇒ τ)

On Val we require that the type can be ‘thinned’ (by thV), that is, the context in the type can be extended; this is used when we go under a binder and need to adapt the environment. Note that thinning isn’t needed for Com, and this offers one justification for the separation of Val and Com: if we also stored Com in the environment, then we’d have to do thinning on Com, which may be harder. (For example, if we store terms rather than variables in the environment during renaming, then we’d have to do thinning on terms, which would be renaming itself.) Then for each term constructor we need a corresponding case specifying how to compute the semantics inductively. The var case takes the result of looking up a variable in the environment, which has type Val Δ τ, and should specify how to turn it into a Com Δ τ. The corresponding case of sem is

sem S (` i) ρ = S .var (lookup ρ i)

The app case assumes that the semantics (of types Com Δ (σ ⇒ τ) and Com Δ σ) have been computed inductively from the two sub-terms of an application, and should specify how to compute the semantics of the whole term (of type Com Δ τ). The corresponding case of sem is

sem S (t · u) ρ = S .app (sem S t ρ) (sem S u ρ)

The lam case works on an abstraction and basically assumes that the semantics of the body (of type Com (σ ∷ Δ) τ) has been inductively computed; this semantics in fact depends on the Val (σ ∷ Δ) σ to which the newly introduced variable is mapped by the extended environment, so what’s inductively computed from the body is a function from Val (σ ∷ Δ) σ to Com (σ ∷ Δ) τ. From this function we should then be able to compute the semantics of the abstraction (of type Com Δ (σ ⇒ τ)). The corresponding case of sem applies lam to the function that computes the semantics of the body inductively with the environment extended with the function’s input after its original contents are thinned,

sem S (ƛ t) ρ = S .lam λ v → sem S t (v ▸ modify (S .thV) ρ)

where environment thinning is done with the helper function

modify : {M M' : Model} → (∀ {τ} → M Δ τ → M' Δ' τ) → Env Γ M Δ → Env Γ M' Δ'
lookup (modify f ρ) i = f (lookup ρ i)

and environment extension is defined by

_▸_ : {M : Model} → M Δ σ → Env Γ M Δ → Env (σ ∷ Γ) M Δ
lookup (m ▸ ρ) zero    = m
lookup (m ▸ ρ) (suc x) = lookup ρ x

Renaming and substitution are now easily implemented as instances of Semantics:

ren : Semantics
ren .Val Δ τ = Δ ∋ τ
ren .thV     = suc
ren .Com Δ τ = Δ ⊢ τ
ren .var i   = ` i
ren .app t u = t · u
ren .lam f   = ƛ f zero

sub : Semantics
sub .Val Δ τ = Δ ⊢ τ
sub .thV t   = sem ren t (env suc)
sub .Com Δ τ = Δ ⊢ τ
sub .var i   = i
sub .app t u = t · u
sub .lam f   = ƛ f (` zero)

Notably, environment management is entirely hidden from the user of the traversal operator (i.e., the provider of a Semantics), which I believe is the essence of the operator.

For a differently flavoured but probably also illuminating example, we can implement a rudimentary term printer which interprets terms as strings in a state monad where the state is a string of single-character fresh names; the environment is used to store strings to which variables are printed. Notably, the lam case is quite natural: we extract a fresh varName from the state and apply the inductively assumed function f to varName to get the string for the body, in which the occurrences of the new variable are printed to varName.

printer : Semantics
printer .Val Δ τ   = Char
printer .thV c     = c
printer .Com Δ τ   = State String String
printer .var c     = return (fromChar c)
printer .app ms mt = do
  s ← ms
  t ← mt
  return ("(" ++ s ++ t ++ ")")
printer .lam f     = do
  fresh ← get
  let (varName , fresh') = split fresh
  put fresh'
  body ← f varName
  put fresh
  return ("(λ" ++ fromChar varName ++ "." ++ body ++ ")")
    split : String → Char × String
    split s with toList s
    split s | []     = ' ' , ""
    split s | x ∷ xs = x   , fromList xs

As a test case, the term

testTerm : [] ⊢ ((ι ⇒ ι) ⇒ (ι ⇒ ι) ⇒ ι ⇒ ι) ⇒ ι ⇒ ι
testTerm = ƛ (ƛ ` suc zero · ` zero · ` zero) · (ƛ ` zero)

can be printed as

proj₁ (sem {Δ = []} printer testTerm (env λ ()) "abcdefg")
  = "(λa.((λb.((ab)b))(λb.b)))"

(with excessive parentheses which I don’t bother to optimise away).

Remark. In the CPP ’17 paper, the antecedent of lam (and similarly for thV) is a stronger Kripke-style function:

lam : (∀ Δ' → Δ ⊆ Δ' → Val Δ' σ → Com Δ' τ) → Com Δ (σ ⇒ τ)

I suspect that this is needed only in their example on normalisation by evaluation, which I didn’t try, although I imagine that being able to inductively assume a Kripke-style function in the lam case is another selling point of (the full version of) the traversal operator. (End of remark.)

An induction operator?

The CPP ’17 paper culminates with some theorems about the traversal operator, one of which is simulation of semantics, which abstracts logical relation proofs. As a typical example, I decide to try and prove the free theorems for terms interpreted as polymorphic functions. This Semantics is fairly standard. At type level, we can interpret a Ty as an Agda function type where all the occurrences of ι become a given Agda type X : Set:

PolyTy : Ty → Set → Set
PolyTy ι       X = X
PolyTy (σ ⇒ τ) X = PolyTy σ X → PolyTy τ X

This is then extended to contexts and judgements:

PolyCxt : Cxt → Set → Set
PolyCxt []      X = ⊤
PolyCxt (σ ∷ Γ) X = PolyCxt Γ X × PolyTy σ X

PolyJdg : Cxt → Ty → Set → Set
PolyJdg Γ τ X = PolyCxt Γ X → PolyTy τ X

By using PolyJdg as the models, we can define a functional Semantics to interpret terms as functions. In particular, variables are interpreted as projections, as we’ll see in the lam and thV cases below: initially a newly introduced variable of type σ ∷ Δ ∋ σ is assigned the second projection proj₂ : PolyCxt Δ X × PolyTy σ X → PolyTy σ X in the environment; as the environment is pushed under binders, the projections are thinned by pre-composing proj₁. The rest of the Semantics is straightforward.

fun : Set → Semantics
fun X .Val Δ τ     = PolyJdg Δ τ X
fun X .thV f       = f ∘ proj₁
fun X .Com Δ τ     = PolyJdg Δ τ X
fun X .var         = id
fun X .app mf mx r = (mf r) (mx r)
fun X .lam f       = curry (f proj₂)

Given a function f : X → Y, we say that two values x : X and y : Y are related exactly when f x ≡ y, and that two functions are related exactly when they map related inputs to related outputs. We can formalise this definition of relatedness for Ty, contexts, and judgements:

PolyTyR : {X Y : Set} (f : X → Y) → PolyTy τ X → PolyTy τ Y → Set
PolyTyR {τ = ι    }                 f x y = f x ≡ y
PolyTyR {τ = σ ⇒ τ} {X = X} {Y = Y} f g h =
  {x : PolyTy σ X} {y : PolyTy σ Y} → PolyTyR f x y → PolyTyR f (g x) (h y)

PolyCxtR : {X Y : Set} (f : X → Y) → PolyCxt Γ X → PolyCxt Γ Y → Set
PolyCxtR {Γ = []   } f tt      tt        = ⊤
PolyCxtR {Γ = σ ∷ Γ} f (p , q) (p' , q') = PolyCxtR f p p' × PolyTyR f q q'

PolyJdgR : {X Y : Set} (f : X → Y) → PolyJdg Γ τ X → PolyJdg Γ τ Y → Set
PolyJdgR {Γ = Γ} {X = X} {Y = Y} f g h =
  {x : PolyCxt Γ X} {y : PolyCxt Γ Y} → PolyCxtR f x y
                                      → PolyTyR f (g x) (h y)

A free theorem then states that the functions sem (fun X) t ρX and sem (fun Y) t ρY are related (satisfy the relation PolyJdgR f) whenever ρX and ρY are pointwise related. The following is a concrete example of a free theorem,

(t : [] ⊢ ι ⇒ ι) (X Y : Set) (f : X → Y)
  → let g = sem (fun X) t (env λ ()) tt
        h = sem (fun Y) t (env λ ()) tt
    in  (x : X) → f (g x) ≡ h (f x)

from which one can further deduce that the functional semantics of any t : [] ⊢ ι ⇒ ι can only be the identity function.

It is possible to establish a general theorem stating that if two semantics are in simulation — that is, all the cases preserve some specified relations on the models — then the traversals of a term using the two semantics are related whenever the environments are pointwise related. Free theorems then become a special case. The simulation theorem is pasted below but only for archival purposes — I don’t intend to go into the details:

record Sim (S S' : Semantics) : Set₁ where
    RVal : S .Val Γ τ → S' .Val Γ τ → Set
    thRV : {v : S .Val Γ τ} {v' : S' .Val Γ τ}
         → RVal v v' → RVal {Γ = σ ∷ Γ} (S .thV v) (S' .thV v')
    RCom : S .Com Γ τ → S' .Com Γ τ → Set
    var  : {v : S .Val Γ τ} {v' : S' .Val Γ τ}
         → RVal v v' → RCom (S .var v) (S' .var v')
    app  : {c  : S  .Com Γ (σ ⇒ τ)} {d  : S  .Com Γ σ}
           {c' : S' .Com Γ (σ ⇒ τ)} {d' : S' .Com Γ σ}
         → RCom c c' → RCom d d' → RCom (S .app c d) (S' .app c' d')
    lam  : {f  : S  .Val (σ ∷ Γ) σ → S  .Com (σ ∷ Γ) τ}
           {f' : S' .Val (σ ∷ Γ) σ → S' .Com (σ ∷ Γ) τ}
         → ((v : S .Val (σ ∷ Γ) σ) (v' : S' .Val (σ ∷ Γ) σ)
           → RVal v v' → RCom (f v) (f' v'))
         → RCom (S .lam f) (S' .lam f')

record REnv {M M' : Model} (R : ∀ {Δ τ} → M Δ τ → M' Δ τ → Set)
            {Γ Δ : Cxt} (ρ : Env Γ M Δ) (ρ' : Env Γ M' Δ) : Set where
  constructor renv
    lookup : (x : Γ ∋ τ) → R (lookup ρ x) (lookup ρ' x)

_▸ᴿ_ : {M M' : Model} {R : ∀ {Δ τ} → M Δ τ → M' Δ τ → Set}
     → {v : M Δ τ} {v' : M' Δ τ} → R v v'
     → {ρ : Env Γ M Δ} {ρ' : Env Γ M' Δ}
     → REnv R ρ ρ' → REnv R (v ▸ ρ) (v' ▸ ρ')
lookup (vr ▸ᴿ ρr) zero    = vr
lookup (vr ▸ᴿ ρr) (suc i) = lookup ρr i

sim : (S S' : Semantics) (R : Sim S S') → (t : Γ ⊢ τ)
    → (ρ : Env Γ (S .Val) Δ) (ρ' : Env Γ (S' .Val) Δ)
    → REnv (R .RVal) ρ ρ' → R .RCom (sem S t ρ) (sem S' t ρ')
sim S S' R (` i  ) ρ ρ' ρr = R .var (lookup ρr i)
sim S S' R (t · u) ρ ρ' ρr = R .app (sim S S' R t ρ ρ' ρr)
                                    (sim S S' R u ρ ρ' ρr)
sim S S' R (ƛ t  ) ρ ρ' ρr =
  R .lam λ v v' vr → sim S S' R t (v  ▸  env  (S  .thV ∘ lookup ρ ))
                                      -- modify (S .thV) ρ
                                  (v' ▸  env  (S' .thV ∘ lookup ρ'))
                                  (vr ▸ᴿ renv (R .thRV ∘ lookup ρr))

The only observation I intend to make is that sim is computationally the same as sem if we see ρ, ρ', and ρr as a single environment storing a pair of related semantics for every variable. The type of sem is too weak to subsume sim though: right now sem is more like a fold operator, but what we need here is more like an induction operator, whose result type can depend on the input term and environment. Here is my attempt at formulating the induction operator, but again I won’t go into the details, for the reasons that I’ll give at the end of this post:

record Semantics' : Set₁ where
    VM  : Model
    thV : VM Δ τ → VM (σ ∷ Δ) τ
    TM  : Γ ⊢ τ → Env Γ VM Δ → Set
    var : (i : Γ ∋ τ) (ρ : Env Γ VM Δ) → TM (` i) ρ
    app : {t : Γ ⊢ σ ⇒ τ} {u : Γ ⊢ σ} {ρ : Env Γ VM Δ}
        → TM t ρ → TM u ρ → TM (t · u) ρ
    lam : {t : σ ∷ Γ ⊢ τ} {ρ : Env Γ VM Δ}
        → ((v : VM (σ ∷ Δ) σ) → TM t (v ▸ modify thV ρ)) → TM (ƛ t) ρ

sem' : (S : Semantics') (t : Γ ⊢ τ) (ρ : Env Γ (S .VM) Δ) → S .TM t ρ
sem' S (` i  ) ρ = S .var i ρ
sem' S (t · u) ρ = S .app (sem' S t ρ) (sem' S u ρ)
sem' S (ƛ t  ) ρ = S .lam λ v → sem' S t (v ▸ modify (S .thV) ρ)

It is possible to write everything we’ve seen so far with this induction operator, in particular the simulation theorem, although some non-trivial type transportation starts to show up, which involves function extensionality because it’s about equating different ways of extending environments, which are functional by nature (but fortunately we now have Cubical Agda, where function extensionality is naturally provable). It quickly becomes obvious that this operator isn’t general enough though. The CPP ’17 paper goes on to discuss fusion of semantics, so that for example we can prove that two renamings can be fused into one, which can be proved with sem' as follows (again the details aren’t important, and in fact the lam case isn’t completely shown):

compose : {M : Model} → Env Γ _∋_ Δ → Env Δ M Θ → Env Γ M Θ
lookup (compose ρ ρ') i = lookup ρ' (lookup ρ i)

renren : Semantics'
renren .VM Δ τ = Δ ∋ τ
renren .thV = suc
renren .TM {Δ = Δ} t ρ =
  {Θ : Cxt} (ρ' : Env Δ (ren .VM) Θ)
  → sem' ren (sem' ren t ρ) ρ' ≡ sem' ren t (compose ρ ρ')
renren .var i ρ ρ' = refl
renren .app mteq mueq ρ' = cong₂ _·_ (mteq ρ') (mueq ρ')
renren .lam {t = t} f ρ' =
  cong ƛ_ (f zero (zero ▸ modify suc ρ') ∙ cong (sem' ren t) ?)

Note that, in the motive TM, besides the environment ρ provided by the induction operator, we also need another environment ρ' for the second renaming, and have to manually introduce a universal quantification. Subsequently we have to manage ρ' manually, notably in the lam case, where we even have to thin and extend ρ' ourselves, which should have been taken care of by the operator. Maybe the induction operator can be further generalised to manage multiple environments, but it starts to look ad hoc. Therefore, even though my first instinct as a dependently typed programmer is to construct an induction operator so as to deal with both computation and proof in one go, it’s in fact not that obvious how to promote traversal to a natural and sufficiently powerful form of induction — no wonder that the paper merely presented the traversal operator and proved its properties separately.

Agda file: Kit.agda