Shin and his summer intern did a survey of Okasaki’s idea of numerical representations for designing concrete sequence representations that can be used (abstractly) as queues, stacks, random-access lists, etc. They wrote a series of Agda programs, following my old recipe of indexing the types of data structures and operations with their underlying numerical structure. It’s all well and good, but still far from the usual functional correctness that we want for (for example) queues, namely enforcing the first-in-first-out (FIFO) order, which I’ll call ‘queue order’ for short. I had a hunch that parametricity could lead to a very clean specification of queue order, and a bit of experimentation showed that the hunch is probably correct, although I didn’t expect that most of the techniques involved would turn out to be old and familiar ones.
The first non-trivial example of a queue implementation is usually this two-list representation:
module _ (A : Set) where
State : Set
State = List A × List A
empty : State
empty = [] , []
push : A → State → State
push x (front , rear) = front , x ∷ rear
pop : State → Maybe (A × State)
pop ([] , rear) with reverse rear
pop ([] , rear) | [] = nothing
pop ([] , rear) | x ∷ xs = just (x , (xs , []))
pop (x ∷ front , rear) = just (x , (front , rear))
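As a quick sanity check (a sketch: it assumes the anonymous module above is given a name, say SimpleTwoList, so that it can be opened at a concrete element type), elements come out in the order they were pushed:

```agda
open import Data.Nat using (ℕ)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

module Check where
  open SimpleTwoList ℕ   -- hypothetical name for the module above

  -- 1 was pushed first, so 1 is popped first; the reversed rear
  -- becomes the new front.
  check : pop (push 3 (push 2 (push 1 empty))) ≡ just (1 , (2 ∷ 3 ∷ [] , []))
  check = refl
```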
The simple types do not prevent the programmer from writing, for example,
wrongPop : State → Maybe (A × State)
wrongPop (front , rear) with reverse rear
wrongPop (front , rear) | [] = nothing                            -- forgets the front elements
wrongPop (front , rear) | x ∷ xs = just (x , (front ++ xs , []))  -- wrong order
where elements are accidentally forgotten or reordered. At this point people usually think of substructural type systems, but that implies drastic changes to the language, which I want to avoid.
Another thing that people might think of is ordered data structures, which are standard examples of dependently typed programming.
In particular, Conor McBride wrote an ICFP 2014 paper on these, where, for example, the datatype of ordered lists is parametrised by both an element type A : Set
and an ordering R : A → A → Set
, and is indexed by a lower bound and an upper bound:
data OList (A : Set) (R : A → A → Set) : A → A → Set where
[] : {{R l u}} → OList A R l u
_∷_ : (x : A) → {{R l x}} → OList A R x u → OList A R l u
An inhabitant of OList A _≤_ l u
looks like l ≤ x₀ ≤ x₁ ≤ x₂ ≤ x₃ ≤ u
, where ordering proofs are put between adjacent elements xᵢ
and bounds.
(These ordering proofs are declared as instance arguments so that the proofs can be automatically dealt with when defining functions on OList
— we’ll see examples later.)
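For a concrete picture, an inhabitant might be written as follows (a sketch: it assumes an ordering _≤_ on ℕ whose proofs instance resolution can find, for example one defined by a Boolean test so that a true comparison reduces to a trivially inhabited type, as in Conor’s paper):

```agda
-- Assuming `_≤_ : ℕ → ℕ → Set` computes to a trivially inhabited
-- type on true comparisons, instance search fills in the proofs
-- of 0 ≤ 2, 2 ≤ 5, 5 ≤ 7, and (for the final []) 7 ≤ 9.
example : OList ℕ _≤_ 0 9
example = 2 ∷ 5 ∷ 7 ∷ []
```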
This might not appear too relevant because queue order isn’t about ordering the elements themselves.
But I realised that queue ordering is actually a special case:
Seq : Set → ℕ → ℕ → Set
Seq A l u = OList (Maybe A × ℕ) (λ (_ , t) (_ , t') → t < t')
(nothing , l) (nothing , u)
That is, just use an element type like Maybe A × ℕ
where the second component records the time when the element is inserted into the queue, and order the elements by their time of insertion; the two bounds are not real elements but can still be assigned a time of insertion, hence the Maybe A
.
So theoretically OList
suffices, although we could achieve a cleaner separation of elements and their insertion time by putting the latter at the index level:
data Seq {Time : Set} (P : Time → Set) (R : Time → Time → Set)
: Time → Time → Set where
[] : {{R l u}} → Seq P R l u
_∷_ : P m → {{R l m}} → Seq P R m u → Seq P R l u
This was actually the first idea that came to me, but of course OList
is easier, so we’ll just work with OList
below.
Conor’s OList
can lead to awkward function definitions though.
For example, to write the usual list append, we’d have to write some proofs manually:
_++_ : OList A R l m → OList A R m u → OList A R l u
[] ++ ys = {! …, _ : R l m, ys : OList A R m u ⊢ OList A R l u !}
(x ∷ xs) ++ ys = x ∷ (xs ++ ys)
In the inductive case, the proofs are automatically dealt with by instance resolution.
But in the base case, we should produce an OList A R l u
, whereas ys
has a different type OList A R m u
.
We have a proof of R l m
though (which is the instance argument of the []
constructor), so with some effort we should be able to fuse that proof into ys
and fulfil the goal.
Conor discussed this situation in their paper; they insisted on writing no proofs, and argued that the natural function to write is the following one, which puts m
into the output list as well:
append' : (m : A) → OList A R l m → OList A R m u → OList A R l u
append' m [] ys = m ∷ ys
append' m (x ∷ xs) ys = x ∷ append' m xs ys
Intuitively, the two input lists look like l ≤ x₀ ≤ x₁ ≤ m
and m ≤ x₂ ≤ x₃ ≤ u
.
If we append them in the usual way, we’d have to compress the middle x₁ ≤ m ≤ x₂
into x₁ ≤ x₂
, which requires transitivity; but if we also put m
into the output list, then no proving is required.
Luckily, append'
sufficed for Conor’s use case, but the problem remains: with this representation, there are ordinary functions such as list append that we cannot define without writing proofs.
We’ll fix this problem below by using a different representation.
But before we get to that, there are lessons to be learned here: given the type OList A R l m → OList A R m u → OList A R l u
, if we use/assume no laws of R
, we are pretty much forced to write append'
, so the type acts as a pretty tight specification.
How tight?
We can do a bit of experiment with a simpler type: how many inhabitants does the type OList A R l u → OList A R l u
have?
Let’s try and write a function of this type:
f : OList A R l u → OList A R l u
f [] = []
f (x ∷ []) = {! …, _ : R l x, _ : R x u ⊢ OList A R l u !}
f (x ∷ y ∷ xs) = {! …, _ : R l x, _ : R x y ⊢ OList A R l u !}
In the nil case there are no other possibilities.
In the singleton case, we cannot delete x
and return []
because that would require us to prove R l u
, which requires transitivity (which we don’t assume); and we cannot duplicate x
and return x ∷ x ∷ []
because that would require us to prove R x x
, which requires reflexivity; so we have no choice but to return x ∷ []
.
And of course, in the third case we cannot reorder x
and y
, which requires symmetry in particular.
So f
has to be the identity function.
Intuitively, the elements of an input list of type OList A R l u
are linked together by proofs of R
, and since R
is parametrically quantified and has no laws, in the output list of the same type OList A R l u
we can only use the existing proofs to link the output elements, and the only way of forming links from l
to u
is to go through exactly the same elements and proofs as in the input list.
If R
has laws, then we’re allowed to do more: transitivity allows deletion, reflexivity allows (adjacent) copying, and symmetry allows reordering (with help from other laws).
So there seems to be some kind of correspondence with substructural type systems, which may be interesting to explore — some time later.
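To make the correspondence a bit more concrete, here is a sketch (the helper names are mine; {{…}} binds or supplies the instance proofs explicitly) of how an assumed transitivity licenses deletion of the head element:

```agda
-- Weakening the lower bound of a list fuses a proof into its first
-- link (or into the proof carried by []), using assumed transitivity.
weaken : (∀ {a b c} → R a b → R b c → R a c)
       → R l m → OList A R m u → OList A R l u
weaken trans p ([] {{q}})       = [] {{trans p q}}
weaken trans p (_∷_ x {{q}} xs) = _∷_ x {{trans p q}} xs

-- With transitivity available, deleting the head becomes definable.
deleteHead : (∀ {a b c} → R a b → R b c → R a c)
           → OList A R l u → OList A R l u
deleteHead trans ([] {{q}})       = [] {{q}}
deleteHead trans (_∷_ x {{p}} xs) = weaken trans p xs
```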
Comment.
Zhixuan commented that (as opposed to using a substructural type system) I’m directly working with the monoidal structure on A → A → Set
where the monoidal product is (R □ S) a b = Σ[ c ∈ A ] R a c × S c b
, with no symmetry, projection, or duplication.
I think this is probably an important change of perspective: think about and work with R
-proofs, and get intended behaviour on A
-elements as a side effect.
(End of comment.)
Getting back to the problem of awkward definitions, the solution that came to me is
data OList (A : Set) (R : A → A → Set) : A → A → Set where
[] : OList A R u u
_∷_ : (x : A) → {{R l x}} → {{R x m}} → OList A R m u → OList A R l u
An inhabitant of this new OList
datatype looks like l ≤ x₀ ≤ m₀ ≤ x₁ ≤ m₁ ≤ x₂ ≤ m₂ ≤ x₃ ≤ u
, where every element xᵢ
is sandwiched between two separators (including the two bounds).
This representation works nicely for list append:
_++_ : OList A R l m → OList A R m u → OList A R l u
[] ++ ys = ys
(x ∷ xs) ++ ys = x ∷ (xs ++ ys)
If the two input lists are l ≤ x₀ ≤ m₀ ≤ x₁ ≤ m
and m ≤ x₂ ≤ m₂ ≤ x₃ ≤ u
, then there’s no problem producing l ≤ x₀ ≤ m₀ ≤ x₁ ≤ m ≤ x₂ ≤ m₂ ≤ x₃ ≤ u
, where the mediating index m
simply becomes the separator between x₁
and x₂
.
We can also define the directly recursive and slow list reverse, whose type specifies that the function should map a chain of elements from l
to u
to a flipped chain from u
to l
:
slowReverse : OList A R l u → OList A (flip R) u l
slowReverse [] = []
slowReverse (x ∷ xs) = slowReverse xs ++ (x ∷ [])
(This happens to be naturally definable on Conor’s representation too, using append'
.)
The tail-recursive version revcat
can be defined too, with the type describing the loop invariant that an input chain of elements from l
to u
is divided into two, where an initial part has already been reversed and goes from some m
in the middle to l
, and the remaining part goes from m
to u
:
revcat : OList A R m u → OList A (flip R) m l → OList A (flip R) u l
revcat [] ys = ys
revcat (x ∷ xs) ys = revcat xs (x ∷ ys)
reverse : OList A R l u → OList A (flip R) u l
reverse xs = revcat xs []
(I believe this tail-recursive version is awkward to define on Conor’s representation.)
Along the same lines, we can implement two-list queues, with the type of push
(or pop
) saying that an element is pushed (or popped) along with the two links/proofs around it:
module TwoListQueue (A : Set) (R : A → A → Set) where
record State (l u : A) : Set where
constructor _,_
field
{mid} : A
front : OList A R l mid
rear : OList A (flip R) u mid
empty : State u u
empty = [] , []
push : (x : A) → {{R u x}} → {{R x u'}} → State l u → State l u'
push x (front , rear) = front , x ∷ rear
data OListF (X : A → A → Set) : A → A → Set where
nil : OListF X u u
cons : (x : A) → {{R l x}} → {{R x m}} → X m u → OListF X l u
pop : State l u → OListF State l u
pop ([] , rear) with reverse rear
pop ([] , rear) | [] = nil
pop ([] , rear) | x ∷ xs = cons x (xs , [])
pop (x ∷ front , rear) = cons x (front , rear)
All the function definitions are natural and, intuitively, have to be correct due to the precise types — for example, wrongPop
would not type-check.
And when we use the queue, we don’t actually need to find a non-trivial element ordering — just use Maybe A
as the element type (so that we can supply nothing
as separators) and specialise R
to λ _ _ → ⊤
, and the queue will essentially revert to the simply typed version, but guaranteed to work correctly.
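Spelled out, the instantiation might look like this (a sketch; the names SimpleQueue, triv, and push' are mine):

```agda
open import Data.Maybe using (Maybe; just; nothing)
open import Data.Unit using (⊤; tt)

module SimpleQueue (A : Set) where
  open TwoListQueue (Maybe A) (λ _ _ → ⊤)

  instance
    triv : ⊤      -- the trivial ordering proofs, found by instance search
    triv = tt

  -- push an actual element, keeping `nothing` as the separators
  push' : A → State nothing nothing → State nothing nothing
  push' x = push (just x)
```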
Of course, formally we need parametricity to prove the correctness claims. Here I’ll just do the simplest exercise: prove that the type of list reverse
ReverseType : Set₁
ReverseType = {A : Set} {R : A → A → Set} {l u : A}
→ OList A R l u → OList A (flip R) u l
is uniquely inhabited (up to extensional equality), demonstrating that the type is a tight specification.
Unique inhabitance doesn’t directly say how a function of this ReverseType
behaves, but we know that, for example, reverse
(or slowReverse
) inhabits this type, so anything we know about the extensional behaviour of reverse
will hold for all other functions of the same type.
There should be other kinds of property that we can derive from the type, but it’s hard to tell which one is the most useful without knowing the specific scenario.
My choice of unique inhabitance has the slight advantage of being generic and therefore having more potential for generalisation to other types.
In this post I’ll just give a proof sketch; a formal proof can be found in the Agda file attached at the end (which will be my ‘slides’ for talking about this at the upcoming WG 2.1 Meeting in Viana do Castelo, Portugal).
I learned the proof idea from Janis Voigtländer’s ‘Bidirectionalization for free!’, although I’m sure similar ideas are used elsewhere too.
Long story short: it’s about determining the behaviour of a function f : ∀ {A} → List A → List A
from one of its concrete instances.
Given xs : List A
, we associate every element of xs
with a unique ID, say a natural number, forming ns : List ℕ
, and compute f {ℕ} ns
, the output IDs, which have to be arranged in the same way as f {A} xs
due to parametricity. More formally, the associations can be represented as a function assoc : ℕ → A
such that map assoc ns ≡ xs
, and parametricity (the free theorem) gives us
f {A} xs ≡ f {A} (map assoc ns) ≡ map assoc (f {ℕ} ns).
In the case of OList A R
, just think of A
and R
as two kinds of element type, and the idea still largely works: for any rev
, rev' : ReverseType
and xs : OList A R l u
, find a list ns
and functions assocA
and assocR
such that mapAR assocA assocR ns ≡ xs
(where mapAR
operates on both element types), and then
rev xs
≡ {- specification of ns -}
rev (mapAR assocA assocR ns)
≡ {- free theorem -}
mapAR assocA (flip assocR) (rev ns)
≡ {- see below -}
mapAR assocA (flip assocR) (rev' ns)
≡ {- free theorem -}
rev' (mapAR assocA assocR ns)
≡ {- specification of ns -}
rev' xs
In the middle we need rev ns
and rev' ns
to be equal: when constructing ns
, if we choose concrete types like A = ℕ
and R x y = (suc x ≡ y)
, requiring a chain of natural numbers linked by R
to be consecutive (and proofs of R x y
are unique since ℕ
is an h-set), then the type OList A R l u
will fully determine the elements in its inhabitant, so rev ns
and rev' ns
are equal simply because they have the same type.
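Spelling out this instantiation (a sketch; Consec is my name for it):

```agda
open import Data.Nat using (ℕ; suc)
open import Relation.Binary.PropositionalEquality using (_≡_)

-- Adjacent entries in a chain must be consecutive numbers, and each
-- proof of Consec x y is unique because ℕ is an h-set, so a type such
-- as OList ℕ Consec l u fully determines the elements of its inhabitant.
Consec : ℕ → ℕ → Set
Consec x y = suc x ≡ y
```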
The actual Agda proof needs to be more precise, for example making assocA
total by defining a finite domain (instead of ℕ
) whose size depends on the length of xs
, but the proof is essentially what’s sketched above.
I think the reasoning pattern used here is that our intuition about how the function processes elements of abstract types A
and R
is faithfully captured in the behaviour of the function at the chosen concrete types, which we can determine because everything is concrete, and then parametricity says that the function must behave uniformly for all types, so our intuition is valid in general.
I wonder if we can turn this reasoning pattern into some form of reusable theorem so that in the future we can simply say that unique inhabitance of ReverseType
(or, for example, of the type of retabulate
in the binomial tabulation development) is a corollary of the theorem (otherwise it’s not really science…).
I believe that the types of empty
, push
, and pop
can all be proved to be uniquely inhabited in the same way.
On the other hand, as can be seen above, I don’t think there will be much new stuff in these proofs.
A more interesting task would be proving that the abstract queue type
record Queue (A : Set) (R : A → A → Set) : Set₁ where
field
State : A → A → Set
empty : State u u
push : (x : A) → {{R u x}} → {{R x u'}} → State l u → State l u'
pop : State l u → OListF A R State l u
or more precisely, the parametrically quantified type ∀ {A R} → Queue A R
(so that parametricity can help), is uniquely inhabited up to bisimilarity, so that we can prove the correctness of all type-checked implementations once and for all: since the type is inhabited by, for example, this simplest one-list implementation, on which we can easily prove extensional properties we want,
oneListQueue : ∀ {A R} → Queue A R
oneListQueue {A} {R} = record
{ State = OList A R
; empty = []
; push = λ x xs → xs ++ (x ∷ [])
; pop = λ { [] → nil
; (x ∷ xs) → cons x xs } }
we’ll know that any other implementation, for example the two-list queue
twoListQueue : ∀ {A R} → Queue A R
twoListQueue {A} {R} = record
{ State = State
; empty = empty
; push = push
; pop = pop }
where open TwoListQueue A R
has the same extensional behaviour and therefore satisfies the same properties. I think I have some vague ideas, and will try to devise a proof and write another post. And we should think about whether we can specify other kinds of ordering too — stack order and priority-queue order, for example. Priority-queue order is probably difficult; even stack order may not be as straightforward as it looks…
Agda file: WG21-Sep25.agda