The trek goes on

0037

Binomial trees as containers

Posted at 23:07 on 24 April 2024

Besides being trapped on an ICE train for an hour (without a seat), my journey to the IFIP Working Group 2.1 meeting in Neustadt, Germany and ETAPS in Luxembourg was smooth. I’m getting even more uncomfortable with big conferences like ETAPS, but the size of WG 2.1 meetings makes them more tolerable and sometimes even enjoyable — I’ll see if I can adapt to the latter.

At the 2.1 meeting, I talked about my dependently typed reformulation of Richard’s programs for top-down and bottom-up algorithms (which Shin dealt with using equational reasoning in the previous Oxford meeting), mainly showing the BT datatype,

data BT : (n k : ℕ) → (Vec A k → Set) → Vec A n → Set where
  tipZ : P []                                → BT n         zero    P xs
  tipS : P xs                                → BT (succ k) (succ k) P xs
  bin  : BT n (succ k)        P           xs
       → BT n       k (λ zs → P (x ∷ zs)) xs → BT (succ n) (succ k) P (x ∷ xs)

and stopped with the observation that induction principles are propositions (no string diagrams). It occurred to me that I could say that BT n k P xs is an alternative representation of quantification over k-sublists of xs, and explicitly write the latter as

(ys : k-sublist of xs) → P ys

where ‘-sublist of’ is informal. Later James McKinna suggested that I try a functional representation, which I realised could be thought of as a formal version of the function types above. All I need to do is formalise the sublist relation, which is straightforward:

data _⊆_ {A : Set} : {n k : ℕ} → Vec A k → Vec A n → Set where
  nil  :               [] ⊆ []
  keep : ys ⊆ xs → x ∷ ys ⊆ x ∷ xs
  drop : ys ⊆ xs →     ys ⊆ x ∷ xs

And then I can write the quantification as

Forall_-sublistsOf_⇒_ : ({n} k : ℕ) → Vec A n → (Vec A k → Set) → Set
Forall k -sublistsOf xs ⇒ P = {ys : Vec _ k} → ys ⊆ xs → P ys

Now, in place of BT, I can switch to this functional representation everywhere in the development. For example, the induction principle can be rewritten as

ImmSubInd : Set₁
ImmSubInd = {A : Set} (P : ∀ {k} → Vec A k → Set)
          → P []
          → (∀ {k} {ys : Vec A (succ k)}
             → (Forall k -sublistsOf ys ⇒ P) → P ys)
          → ∀ {n} (xs : Vec A n) → P xs

where the inductive case uses the newly defined quantification over sublists as the type of the induction hypotheses — if all the k-sublists of ys : Vec A (succ k) (that is, all the immediate sublists of ys) satisfy P, then ys itself should satisfy P. Both the top-down and bottom-up algorithms can be easily adapted to have this type. But what’s more dramatic is that the definition of upgrade is simplified to a one-liner:

upgrade :
    (Forall k -sublistsOf xs ⇒ P)
  → (Forall succ k -sublistsOf xs ⇒ λ ys → Forall k -sublistsOf ys ⇒ P)
upgrade t {ys} ys⊆xs {zs} zs⊆ys = t (⊆-trans zs⊆ys ys⊆xs)

For every (succ k)-sublist ys of xs and k-sublist zs of ys, upgrade needs to produce P zs; by the transitivity of _⊆_,

⊆-trans : {xs : Vec A n} {ys : Vec A m} {zs : Vec A k}
        → zs ⊆ ys → ys ⊆ xs → zs ⊆ xs

zs is easily shown to be a sublist of xs, so we can just look it up in the input t —which consists of P-values for every sublist of xs— and get P zs. As simple as that!

Obviously, a comparison between the two representations is needed. Eventually I saw that the Forall-sublists quantification is essentially a container representation of BT, mapping every position in a BT-tree to the element at that position: A position in a tree should be a path from the root to an element, and indeed an inhabitant of the sublist relation can be interpreted as instructions to go left (drop), go right (keep), or stop (nil) when traversing a BT-tree. These inhabitants do not strictly correspond to the paths in a BT-tree, but when converting a BT to a Forall-sublists quantification, at bin nodes we do go into left or right subtrees as instructed:

fromBT : BT n k P xs → Forall k -sublistsOf xs ⇒ P
fromBT (tipZ p)       nil     = p
fromBT (tipS p)      (keep s) = subst _ (sym (⊆⇒≡ s)) p
  -- ⊆⇒≡ : {xs ys : Vec A n} → ys ⊆ xs → ys ≡ xs
fromBT (bin t u)     (keep s) = fromBT u s  -- GOING LEFT
fromBT (tipZ p) {[]} (drop s) = p
fromBT (tipS p)      (drop s) = ⊥-elim ((1+n≰n (⊆-length s)))
  -- ⊆-length : {xs : Vec A n} {ys : Vec A m} → ys ⊆ xs → m ≤ n
fromBT (bin t u)     (drop s) = fromBT t s  -- GOING RIGHT

It is then easy to see that upgrade is essentially a container morphism, which specifies for every position in the output data structure where the element comes from the input data structure. What’s drastically different about the container morphism version seems to be its highly local nature: the shape of the output data structure is already fixed, and it suffices to decide locally what each element should be, which is simpler to think about than globally transforming trees with a complex structure and a whole lot of elements.

Despite the simplicity, Shin suspected that the functional representation is not efficient, and works only as a prototype or a specification, which is then refined to BT. I noted that deriving a datatype of paths (which is roughly the sublist relation) from a datatype of trees is something I’ve done (in the NDGP paper); conversely, I’ve conjectured that it’s possible to derive BT from a nondeterministic version of choose, of which the sublist relation can be thought of as an alternative form. Now a challenging problem is to also derive Richard’s upgrade from the container morphism version.

Moreover, since deriving path datatypes is formally related to differentiation, I suspect that we’re looking at the converse problem of datatype integration: knowing all possible paths, derive a datatype containing all those paths (and no more). The fundamental theorem of calculus seems to make sense too: if we compute a datatype of paths for the derived datatype, we should get back to the original datatype, and vice versa — it’s really a lossless change between the local and global perspectives. And deriving upgrade is like solving a differential/difference equation (again by integration): deducing global behaviour (tree computation) from local behaviour (element computation).

One requirement about integration that’s usually not explicitly mentioned is that the result of an integral should best be a closed form and easy to compute: for example, $\int x^2\,dx$ gives $\frac{1}{3}x^3$ rather than something involving sums, limits, etc. Results of datatype and program integration should be something nice too: conceptually, integration is just collecting locally computed results into a data structure; if there’s no advantage working with the aggregate data structure than just computing all the local results, the integration does nothing interesting. One possible advantage is that we can deal with overlapping subproblems more efficiently, so we’re back to dynamic programming, except that this time we may be able to view it from a calculus (of differences and sums) perspective.

For now, all these are science fiction. To turn these into real science, I think I’ll need more examples first.

Agda file: BT-TwoPointOne.agda