The trek goes on

0010

McBride’s razor

Posted at 22:02 on 30 July 2020, and revised at 16:34 on 11 October 2021

Due to the pandemic, our original special lecturer for FLOLAC ’20 cannot make the trip, and we (lecturers at IIS) have to settle for something we can teach ourselves, which is dependently typed programming (in Agda). We’ll jointly cover basic material such as vectors, equational reasoning combinators, and the shallow embedding of logic formulas and their derivations, as well as more advanced material including the deep embedding of propositional logic (which is just simply typed lambda calculus) and a few of its meta-theoretic properties à la PLFA. To draw a contrast with extrinsic theorem proving, in the last session I want to present something that demonstrates the power and potential of intrinsically typed programming and is more exciting than vectors. Conor McBride’s showcase example in his original ornament paper comes to mind: the evaluation of additive expressions (a language he calls ‘Hutton’s razor’) is compiled into instructions for a stack machine, and the preservation of semantics is proved purely by datatype indexing.

The focus of the three-hour session (including practicals) is the contrast between the extrinsic and intrinsic styles. The typical example is list append

_++_ : List A → List A → List A
[]       ++ ys = ys
(x ∷ xs) ++ ys = x ∷ (xs ++ ys)

and how it interacts with length:

++-length : (xs ys : List A) → length (xs ++ ys) ≡ length xs + length ys
++-length []       ys = refl
++-length (x ∷ xs) ys =
  begin
    length ((x ∷ xs) ++ ys)
  ≡⟨ refl ⟩
    length (x ∷ (xs ++ ys))
  ≡⟨ refl ⟩
    1 + length (xs ++ ys)
  ≡⟨ cong (1 +_) (++-length xs ys) ⟩
    1 + length xs + length ys
  ≡⟨ refl ⟩
    length (x ∷ xs) + length ys
  ∎

This proof is somewhat trivial though — we could have convinced ourselves of the truth of ++-length just by going through the definition of append and tracing the length of the lists appearing in the definition. In fact, the dependent type system can do the tracing for us: by working with vectors instead of lists, we can effortlessly make the definition of append carry the ++-length theorem implicitly, eliding an explicit proof.

_++_ : Vec A m → Vec A n → Vec A (m + n)
[]       ++ ys = ys
(x ∷ xs) ++ ys = x ∷ (xs ++ ys)

‘McBride’s razor’ pushes the idea to an extreme and constitutes a more intriguing example.

We start with a simple language of additive expressions and its evaluation function:

data Expr : Set where
  lit : ℕ → Expr
  _∔_ : Expr → Expr → Expr

eval : Expr → ℕ
eval (lit x) = x
eval (l ∔ r) = eval l + eval r
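
For example, the expression (1 + 2) + 3 is represented and evaluated as follows (a small machine-checked test, with _≡_ and refl in scope as usual):

-- (1 + 2) + 3 evaluates to 6
_ : eval ((lit 1 ∔ lit 2) ∔ lit 3) ≡ 6
_ = refl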

The eval function is non-trivially recursive. We might want to eliminate the recursion by compiling it to a series of simple instructions that can be executed sequentially (so that, for example, we can delegate the task of expression evaluation to an elementary school kid, who can do it just by following simple instructions without having to learn about recursion). The instructions we’ll use are for manipulating a stack: we can push a number onto a stack or replace the top two numbers of a stack with their sum, and we also need a sequential composition operator to form a series of instructions.

data Prog : Set where
  push : ℕ → Prog
  add  : Prog
  _▷_  : Prog → Prog → Prog

We’ll immediately encounter a problem, however, if we try to define the semantics of Prog:

⟦_⟧ : Prog → List ℕ → List ℕ
⟦ push x ⟧ xs           = x ∷ xs
⟦ add    ⟧ (x ∷ y ∷ xs) = (x + y) ∷ xs
⟦ p ▷ q  ⟧ xs           = ⟦ q ⟧ (⟦ p ⟧ xs)

The problematic clause is the second one: there is no guarantee that ⟦ add ⟧ is applied only to lists with at least two elements, and thus the above function is not total. We could wrap the result in a Maybe, but this would greatly complicate the definition; a rough sketch of such a Maybe-valued semantics (under the made-up name ⟦_⟧ᴹ) would look something like this:
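
⟦_⟧ᴹ : Prog → List ℕ → Maybe (List ℕ)   -- hypothetical name, for illustration only
⟦ push x ⟧ᴹ xs           = just (x ∷ xs)
⟦ add    ⟧ᴹ (x ∷ y ∷ xs) = just ((x + y) ∷ xs)
⟦ add    ⟧ᴹ _            = nothing   -- stack underflow
⟦ p ▷ q  ⟧ᴹ xs           with ⟦ p ⟧ᴹ xs
... | just ys            = ⟦ q ⟧ᴹ ys
... | nothing            = nothing

Every use site would then have to thread the Maybe through. With indexed datatypes we can do better: instead of coping with instructions that can fail, we can work only with instructions that operate on stacks with enough elements and thus are guaranteed to succeed. This requires tracking the size of a stack and how each instruction changes it, so we define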

Stack : ℕ → Set
Stack = Vec ℕ

and index Prog with the sizes of the stack before and after an instruction is executed:

data Prog : ℕ → ℕ → Set where
  push : ℕ → Prog n (1 + n)
  add  : Prog (2 + n) (1 + n)
  _▷_  : Prog k m → Prog m n → Prog k n

Now the semantics function type-checks and is total:

⟦_⟧ : Prog m n → Stack m → Stack n
⟦ push x ⟧ xs           = x ∷ xs
⟦ add    ⟧ (x ∷ y ∷ xs) = (x + y) ∷ xs
⟦ p ▷ q  ⟧ xs           = ⟦ q ⟧ (⟦ p ⟧ xs)
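
For example, assuming _▷_ is declared right-associative (as its use in compile below suggests), here is a small machine-checked run:

-- push 1, push 2, then replace the top two numbers with their sum
_ : ⟦ push 1 ▷ push 2 ▷ add ⟧ [] ≡ 3 ∷ []
_ = refl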

The compilation of an expression to a series of stack instructions performing its evaluation is pretty easy:

compile : Expr → Prog n (1 + n)
compile (lit x) = push x
compile (l ∔ r) = compile r ▷ compile l ▷ add
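
Compiling the sample expression from before and running the instructions on an empty stack indeed leaves its value on top:

_ : ⟦ compile ((lit 1 ∔ lit 2) ∔ lit 3) ⟧ [] ≡ 6 ∷ []
_ = refl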

Usually we would run the compiled instructions on an empty stack and expect to get the evaluation result at the top of the stack, that is, we should have

(e : Expr) → head (⟦ compile e ⟧ []) ≡ eval e

This statement is too weak to be established by induction, though; we have to strengthen it to one that fully characterises what ⟦ compile e ⟧ does to an arbitrary stack, and then the induction goes through. The proof can be made more readable (but much longer) by using the equational reasoning combinators:

soundness : (e : Expr) (xs : Stack n) → ⟦ compile e ⟧ xs ≡ eval e ∷ xs
soundness (lit x) xs =
  begin
    ⟦ compile (lit x) ⟧ xs
  ≡⟨ refl ⟩
    ⟦ push x ⟧ xs
  ≡⟨ refl ⟩
    x ∷ xs
  ≡⟨ refl ⟩
    eval (lit x) ∷ xs
  ∎
soundness (l ∔ r) xs =
  begin
    ⟦ compile (l ∔ r) ⟧ xs
  ≡⟨ refl ⟩
    ⟦ compile r ▷ compile l ▷ add ⟧ xs
  ≡⟨ refl ⟩
    ⟦ add ⟧ (⟦ compile l ⟧ (⟦ compile r ⟧ xs))
  ≡⟨ cong (⟦ add ⟧ ∘ ⟦ compile l ⟧) (soundness r xs) ⟩
    ⟦ add ⟧ (⟦ compile l ⟧ (eval r ∷ xs))
  ≡⟨ cong ⟦ add ⟧ (soundness l (eval r ∷ xs)) ⟩
    ⟦ add ⟧ (eval l ∷ eval r ∷ xs)
  ≡⟨ refl ⟩
    (eval l + eval r) ∷ xs
  ≡⟨ refl ⟩
    eval (l ∔ r) ∷ xs
  ∎
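
The weaker statement we originally wanted then follows as a one-line corollary (soundness-head is a made-up name, and head is the usual vector head):

soundness-head : (e : Expr) → head (⟦ compile e ⟧ []) ≡ eval e
soundness-head e = cong head (soundness e [])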

Note that compile (l ∔ r) emits instructions for evaluating r before those for evaluating l — this order of putting the sub-results into the stack coincides with the order in which ⟦ add ⟧ sums the top two elements of the stack, so that we don’t have to invoke commutativity of addition in the soundness proof.

The key observation here is that the soundness argument can be carried out in essentially the same way as ++-length, because we can explain what a series of compiled instructions does by going through the definition of compile. Here the property we trace is more involved than length: it’s the semantics of the instructions. But in this particular case the type system can still manage to trace it for us entirely transparently! We index Prog with its semantics (i.e., fuse ⟦_⟧ into Prog in the same way as fusing length into List to get vectors):

data Prog⁺ : (m n : ℕ) → (Stack m → Stack n) → Set where
  push : (x : ℕ) → Prog⁺ n (1 + n) (x ∷_)
  add  : Prog⁺ (2 + n) (1 + n) (apply _+_)
  _▷_  : ∀ {f g} → Prog⁺ k m f → Prog⁺ m n g → Prog⁺ k n (g ∘ f)

where

apply : (ℕ → ℕ → ℕ) → Stack (2 + n) → Stack (1 + n)
apply f (x ∷ y ∷ xs) = f x y ∷ xs

We can then require at type level that our compiler should produce instructions that have a particular semantics, and exactly the same definition goes through.

compile⁺ : (e : Expr) → Prog⁺ n (1 + n) (eval e ∷_)
compile⁺ (lit x) = push x
compile⁺ (l ∔ r) = compile⁺ r ▷ compile⁺ l ▷ add

It is instructive to trace how the type-checking goes through: on the right-hand side of the compile⁺ (l ∔ r) clause, we have

compile⁺ r : Prog⁺      n  (1 + n) (eval r ∷_)
compile⁺ l : Prog⁺ (1 + n) (2 + n) (eval l ∷_)

so definitionally the whole expression on the right-hand side has type

compile⁺ r ▷ compile⁺ l ▷ add
  : Prog⁺ n (1 + n) (apply _+_ ∘ (eval l ∷_) ∘ (eval r ∷_))

where the semantic index can be shown to be definitionally equal to the required one:

  apply _+_ ∘ (eval l ∷_) ∘ (eval r ∷_)
=   { η-conversion; function composition }
  λ xs → apply _+_ (eval l ∷ eval r ∷ xs)
=   { definition of apply }
  λ xs → (eval l + eval r) ∷ xs
=   { definition of eval }
  λ xs → eval (l ∔ r) ∷ xs
=   { η-conversion }
  (eval (l ∔ r) ∷_)

We see that this is essentially the same reasoning as in soundness (l ∔ r), except that the induction hypotheses have been implicitly applied, so the whole argument goes through definitionally and hence can be carried out by the type-checker itself; if anything other than the induction hypotheses and definitional equalities were required, type-checking would fail. Notably, if we swapped the order of the instructions for r and l in the definition of compile⁺ (l ∔ r), then compile⁺ would no longer type-check. This can be considered either a feature or a bug: the type system guides you towards writing a correct program, which is good, but in this case it forces you to write one particular program and rejects another that is equally sensible, which is not so good.

Besides filling out the definitions above, there are some more exercises intended for the students to play with and think about at this point. The easiest one is extending the Expr language (and all the subsequent definitions) with multiplication (or some other operator of the student’s choice). Generalising this a bit, we could allow the user to supply whatever binary operator they want to use,

data Expr : Set where
  ...
  binOp : (ℕ → ℕ → ℕ) → Expr → Expr → Expr
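
The corresponding version of Prog would then have to carry the operator inside the instruction, along the lines of this sketch (hypothetical, mirroring the binOp constructor above):

data Prog : ℕ → ℕ → Set where
  ...
  binOp : (ℕ → ℕ → ℕ) → Prog (2 + n) (1 + n)   -- stores an arbitrary operator as data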

Storing arbitrary binary operators inside instructions may be infeasible for an actual machine (or an elementary school kid), however. This shows that there is more to formal verification than what we can prove: we may be able to prove very strong properties about something, but whether that something adequately models what we actually care about is a different and no less important matter. Another adequacy issue is that Prog is still non-trivially recursive: the composition operator gives rise to a binary tree structure. One way to eliminate the recursion is to switch to a list representation,

data Instr : (m n : ℕ) → Set where
  push : ℕ → Instr n (1 + n)
  add  : Instr (2 + n) (1 + n)

data Prog : (m n : ℕ) → Set where
  end : Prog n n
  _▸_ : Instr k m → Prog m n → Prog k n

and we’ll see that the semantics function becomes a tail-recursive one:

⟦_⟧ : Prog m n → Stack m → Stack n
⟦ end        ⟧ xs           = xs
⟦ push x ▸ p ⟧ xs           = ⟦ p ⟧ (x ∷ xs)
⟦ add    ▸ p ⟧ (x ∷ y ∷ xs) = ⟦ p ⟧ ((x + y) ∷ xs)
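
As a small sanity check (assuming _▸_ is declared right-associative, like list cons):

-- the same computation as before, now as a flat instruction sequence
_ : ⟦ push 1 ▸ push 2 ▸ add ▸ end ⟧ [] ≡ 3 ∷ []
_ = refl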

The tail-recursive function can then be easily implemented as a machine that loops through the instructions and modifies the stack, since a tail-recursive function is just a (computable) representation of state transition (where the arguments to the function are the states). Another way to eliminate recursion, or at least the binary tree structure, is to rotate/normalise a tree to make it lean rightwards. Interestingly, this normalise function happens to be another good example for Prog⁺, since with Prog⁺ we can directly say in the type of normalise that it does not change the semantics:

normalise : ∀ {f} → Prog⁺ m n f → Prog⁺ m n f
normalise (push x) = push x
normalise add      = add
normalise (p ▷ q)  = append (normalise p) (normalise q)

where

append : ∀ {f g} → Prog⁺ k m f → Prog⁺ m n g → Prog⁺ k n (g ∘ f)
append (push x) q = push x ▷ q
append add      q = add ▷ q
-- the last clause type-checks because the two nestings of ∘ involved are
-- definitionally equal: thanks to η for functions, composition re-associates silently
append (p ▷ p') q = p ▷ append p' q

Putting adequacy issues aside, can we generalise binOp a bit more and allow operators of an arbitrary arity?

_^_ : Set → ℕ → Set
A ^ zero = ⊤
A ^ suc n = A × (A ^ n)

data Expr : Set where
  ...
  op : (n : ℕ) → (ℕ ^ n → ℕ) → Expr ^ n → Expr

This turns out to be problematic for the intrinsic approach: crucial to the success of type-checking compile⁺ is the fact that the semantics of the compiled instructions can be deemed judgementally equal to the semantics specified in the type, but for op n the equality in question contains n as a free variable and cannot be established without an induction on n. By contrast, the extrinsic approach still works in much the same way, because there is no problem performing an induction within an extrinsic proof. Similarly, as Conor notes in the paper, if one tries to include some form of control structure in the Expr language (e.g., ifz : Expr → Expr → Expr → Expr with its semantics defined by eval (ifz c t e) = if eval c == 0 then eval t else eval e), one would also need to perform a case analysis on eval c to establish the necessary equality. These examples reveal the rigidity of the intrinsic approach.

One may have doubts about what we actually prove with compile⁺. If we regard Prog⁺ as the foundational definition (and make sure that its semantic index is specified accurately), then constructing compile⁺ seems good enough. On the other hand, eventually we will execute a program on a stack machine step by step and want to ensure that this stepwise execution has the intended behaviour; the stepwise execution is defined only on the original Prog datatype via ⟦_⟧, which bears no formal relationship to compile⁺, so there is still a gap. The more reasonable story is therefore to take Prog and ⟦_⟧ as the foundational definitions and Prog⁺ as an auxiliary definition, and link the latter back to the former. This was done in Conor’s original paper by the recomputation lemma, which says that the index of an algebraically ornamented datatype can be recomputed by a fold over the original datatype. Specialised to the datatypes we’re dealing with, namely Prog⁺ and Prog, the recomputation lemma can be stated and proved as follows,

recomputation : ∀ {f} (p : Prog⁺ m n f) xs → ⟦ forget p ⟧ xs ≡ f xs
recomputation (push x) xs           = refl
recomputation add      (x ∷ y ∷ xs) = refl
recomputation (_▷_ {f = f} {g} p q) xs =
  begin
    ⟦ forget q ⟧ (⟦ forget p ⟧ xs)
  ≡⟨ recomputation q (⟦ forget p ⟧ xs) ⟩
    g (⟦ forget p ⟧ xs)
  ≡⟨ cong g (recomputation p xs) ⟩
    g (f xs)
  ∎

where forget erases the semantics index of Prog⁺:

forget : ∀ {f} → Prog⁺ m n f → Prog m n
forget (push x) = push x
forget add      = add
forget (p ▷ q)  = forget p ▷ forget q

The soundness of the compiler is then directly a corollary of the recomputation lemma, provided that we redefine the compiler as the composition of compile⁺ and forget:
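
compile' : Expr → Prog n (1 + n)
compile' e = forget (compile⁺ e)

With this definition, the soundness proof becomes a one-liner: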

soundness' : (e : Expr) (xs : Stack n) → ⟦ compile' e ⟧ xs ≡ eval e ∷ xs
soundness' e = recomputation (compile⁺ e)

On the surface, the amount of effort for proving soundness' is only slightly less than for proving soundness: the slight difference is due to the fact that recomputation is just a routine induction on the structure of Prog⁺, whereas soundness has to deal with the specific combination of instructions emitted by compile. But in fact, the theory of ornamentation gives us forget and recomputation for free if we properly express the relationship between Prog⁺ and Prog, although that requires datatype-generic programming techniques, which are out of scope for FLOLAC. (The relationship between Prog⁺ and Prog can be inferred for free as well; see Ringer et al.’s recent paper Ornaments for Proof Reuse in Coq.) What can be left as an exercise at FLOLAC is probably the following: after showing the general definition of algebraically ornamented lists,

data AlgList (A {B} : Set) (f : A → B → B) (e : B) : B → Set where
  []  : AlgList A f e e
  _∷_ : (x : A) {y : B} → AlgList A f e y → AlgList A f e (f x y)

ask the students to (i) show that vectors are a special case of AlgList, (ii) state and prove the recomputation lemma for AlgList,

forget : {A B : Set} {f : A → B → B} {e : B} {y : B}
       → AlgList A f e y → List A
forget []       = []
forget (x ∷ xs) = x ∷ forget xs

recomputation : {A B : Set} {f : A → B → B} {e : B} {y : B}
              → (xs : AlgList A f e y) → foldr f e (forget xs) ≡ y
recomputation {f = f} []       = refl
recomputation {f = f} (x ∷ xs) = cong (f x) (recomputation xs)

and (iii) derive list append from vector append and use the recomputation lemma for AlgList to prove ++-length for the derived list append.

Agda files: Razor.agda, FLOLAC-Razor.agda (lecture version)


First post with a two-digit number!