The trek goes on

0038

Always inhabited binomial trees

Posted at 16:10 on 5 June 2024

Continuing 0037, one immediately noticeable difference between upgrade, which works on the Forall-sublists representation, and retabulate, which is the name we use in the short-story paper for the version that works on the BT datatype, is that upgrade doesn’t need a side condition k < n. This side condition helps to exclude the case where the input has shape BT n n, i.e., a tip; this is the top level of the sublist lattice, above which there is no level, and correspondingly the output shape BT n (suc n) is uninhabited. However, the Forall-sublists representation is always inhabited because universal quantification over an empty domain (such as ‘for all 1-sublists of an empty list...’) is vacuously true. If we are looking for a closer correspondence between the two representations, then BT n k should get a nil constructor when n < k, making BT n k always inhabited but possibly with no elements. (That is, all the upper levels are actually still there, but invisible.)

The always inhabited version of BT is

data BT : (n k : ℕ) → (Vec a k → Set) → Vec a n → Set where
  tipZ : p []                        → BT n        zero   p xs
  nil  : .{{n < suc k}}              → BT n       (suc k) p xs
  tipS : p xs                        → BT (suc k) (suc k) p xs
  bin  : .{{n > k}}
       → BT n (suc k) p           xs
       → BT n      k (p ∘ (x ∷_)) xs → BT (suc n) (suc k) p (x ∷ xs)

This datatype definition can be seen as being translated from the following definition of choose that computes a sublist nondeterministically (using List and its Alternative combinators to represent the nondeterminism effect):

choose : (n k : ℕ) → Vec a n → List (Vec a k)
choose n        zero   xs       = pure []
choose n       (suc k) xs       with <-cmp n (suc k)
choose n       (suc k) xs       | tri< n<sk _ _      = empty
choose (suc k) (suc k) xs       | tri≈ _ refl _      = pure xs
choose (suc n) (suc k) (x ∷ xs) | tri> _ _ (s≤s n>k) = choose n (suc k) xs
                                             <|> ((x ∷_) <$> choose n k xs)

Notably, empty translates to the nil constructor, which can be used when we’re choosing more elements than there are, as witnessed by the inequality proof n < suc k; this inequality proof is produced by the with-matching in choose, and turns into an argument of the constructor. Similarly, bin also gets an inequality argument, which makes the indices determine the constructor more directly than the original BT.

Since the indices determine the constructor, the possible cases of zipBTWith are exactly those with matching constructors:

zipBTWith : (∀ {ys} → p ys → q ys → r ys)
          →  ∀ {xs} → BT n k p xs → BT n k q xs → BT n k r xs
zipBTWith f (tipZ z)   (tipZ w)   = tipZ (f z w)
zipBTWith f  nil        nil       = nil
zipBTWith f (tipS z)   (tipS w)   = tipS (f z w)
zipBTWith f (bin t t') (bin u u') = bin (zipBTWith f t  u )
                                        (zipBTWith f t' u')
-- impossible cases
zipBTWith f  nil       (tipS _)   = ⊥-elim (<-irrefl refl it)
zipBTWith f  nil       (bin _ _)  = ⊥-elim (≤⇒≯ (<⇒≤ (≤-pred it)) it)
zipBTWith f (tipS _)    nil       = ⊥-elim (<-irrefl refl it)
zipBTWith f (tipS _)   (bin _ _)  = ⊥-elim (1+n≰n it)
zipBTWith f (bin _ _)   nil       = ⊥-elim (≤⇒≯ (<⇒≤ it) (≤-pred it))
zipBTWith f (bin _ _)  (tipS _)   = ⊥-elim (1+n≰n it)

Thanks to the use of instance arguments, there is no need to supply the inequality proofs in the nil and bin cases. There is a minor technical complication though: instance resolution requires an instance to be the unique one with the right type in the context, but for example in the nil case, the inequality proof required by the nil on the right-hand side can come from the first nil on the left-hand side or the second. Luckily, by marking the inequality argument of nil as irrelevant, Agda is fine with having more than one type-correct instance in the context. This looks good at first, but if we think a bit more about this use of instance arguments, here they are really used as a poor man’s proof automation to deal with inequality proofs, which scales badly — for example, all the impossible cases need to be refuted by manually writing inequality proofs, which should be trivial to an SMT solver. Maybe it’s time to seriously integrate SMT solving into dependently typed languages so that, for example, the definition of zipBTWith in the surface language consists of just the first four clauses, and then the elaborator takes the certificates produced by the SMT solver and reconstructs the proofs in the impossible cases, which the core still sees and checks.

Moving on to retabulate without a side condition:

_∷ᴮᵀ_ : p xs → BT (1 + k) k (p ∘ (x ∷_)) xs → BT (2 + k) (1 + k) p (x ∷ xs)
y ∷ᴮᵀ t = bin (tipS y) t

retabulate : BT n k p xs → BT n (1 + k) (BT (1 + k) k p) xs
retabulate                   nil           = nil {{m≤n⇒m≤1+n it}}
retabulate {xs = []       } (tipZ y)       = nil
retabulate {xs = _ ∷ []   } (tipZ y)       = tipS (tipZ y)
retabulate {xs = _ ∷ _ ∷ _} (tipZ y)       = bin {{s≤s z≤n}} (retabulate (tipZ y)) (tipZ (tipZ y))
retabulate                  (tipS y)       = nil
retabulate (bin   (tipS y)    u          ) = tipS (y ∷ᴮᵀ u)
retabulate (bin t@(bin _ _)     (tipZ z) ) = bin {{s≤s it}} (retabulate t) (mapBT (_∷ᴮᵀ (tipZ z)) t)
retabulate (bin t@(bin _ _)   u@(bin _ _)) = bin {{s≤s it}} (retabulate t) (zipBTWith _∷ᴮᵀ_ t (retabulate u))
-- impossible cases
retabulate (bin   (nil {{l}}) u          ) = ⊥-elim (1+n≰n (≤-trans l it))
retabulate (bin t@(bin _ _)     (tipS _) ) = ⊥-elim (1+n≰n it)
retabulate (bin t@(bin _ _)      nil     ) = ⊥-elim (≤⇒≯ (<⇒≤ it) it)

We need a few new cases to deal with nil, but overall it’s straightforward to finish the definition especially when we know the original one — as before, intrinsic typing helps to avoid most of the traditional proofs. (If the idea of integrating SMT solving was implemented, we would be able to skip all the inequality proofs too, and the program would look almost like a simply typed one.)

I talked about this in a group meeting, where we then continued to discuss the idea of converting between a datatype D and a family P of datatypes of paths pointing to sub-structures in the inhabitants of D, which I argued to be closely related to differentiation and integration. The differentiation direction from D to P has been known for some time, but the integration direction from P to D seems to be unexplored. If we want to make datatype integration a general theory, then we should be able to integrate any datatype. In a simply typed setting, the theory can quickly go ‘out of bounds’. For example, if natural numbers are regarded as paths in an inhabitant of some datatype, what should that datatype be? Well, it’s not actually a datatype but the codatatype of infinite lists — the possible element positions in an infinite list are all the natural numbers. And what should the result of integrating the codatatype of infinite lists be?

And, if any datatype can be integrated, what about those with inhabitants that don’t look like paths, such as binary trees? The widely accepted definition of datatype derivatives does not produce binary trees, mainly due to the rule for differentiating a product of functors: $$ \partial_x (S \times T) \mathrel\mapsto \partial_x S \times T + S \times \partial_x T $$ Since this rule is the same with the one in infinitesimal calculus, (I think) it’s never questioned in the literature, although it’s never been explained why the two rules have the same form either. Now I think it’s merely a coincidence: Conor wanted one-hole contexts, so the rule digs a hole in either $S$ or $T$, but if we don’t insist on having just one hole, then we should be allowed to dig holes in both $S$ and $T$ — that is, in general the rule can have a third summand $\partial_x S \times \partial_x T$. And the third summand is indeed there if we look into the lesser-known finite calculus (covered, for example, in Section 2.6 of Graham, Knuth, and Patashnik’s Concrete Mathematics), where the rule for differentiating a product of functions is $$ \Delta (f \times g) = \Delta f \times g + f \times \Delta g + \Delta f \times \Delta g $$ Intuitively: a change in $f \times g$ can result from a change in $f$, a change in $g$, or a change in both $f$ and $g$. (And we can give a geometric argument by visualising $f \times g$ as the area of a rectangle whose width is $f$ and whose height is $g$.) This is in a more direct correspondence with the datatype story, where things are finite (and there is no role for infinitesimals). Back to the question of integrating the datatype of binary trees, I imagine the result should be the codatatype of infinite binary trees, although the notion of path should be changed to ‘fringe’ in the sense that a fringe for an infinite binary tree is a finite binary tree that indicates a finite portion of the infinite tree to be retained, while all the infinite sub-trees outside the fringe are left out — that is, we’re allowed to use only the third summand of the product differentiation rule. This suggests that we may need to consider different modes of differentiation, which do sound reasonable for a general theory (but also make things complicated).

But things get much more interesting when we start to think about indexed datatypes too. Shin suggested looking again at the sublist relation:

data _⊆_ {a : Set} : List a → List a → Set where
  nil  :               [] ⊆ []
  keep : ys ⊆ xs → x ∷ ys ⊆ x ∷ xs
  drop : ys ⊆ xs →     ys ⊆ x ∷ xs

We can integrate this datatype in two ways. If for every xs we want to collect elements of all the types p ys such that ys ⊆ xs, we get a BT-like datatype that tabulates all the sublists of xs:

data BT' : (List a → Set) → List a → Set where
  tip : p []                → BT' p []
  bin : BT' (p ∘ (x ∷_)) xs
      → BT'  p           xs → BT' p (x ∷ xs)

Conversely, if for every ys we want elements of all the types p xs such that ys ⊆ xs, we get a codatatype that tabulates all the superlists of ys, of which there are infinitely many:

mutual

  record ST (p : List a → Set) (ys : List a) : Set where
    coinductive
    field
      decon : STF p ys

  data STF : (List a → Set) → List a → Set where
    nil  : p []
         → ST (p ∘ (x ∷_)) []        → STF p []
    cons : ST (p ∘ (y ∷_))      ys
         → ST (p ∘ (x ∷_)) (y ∷ ys)  → STF p (y ∷ ys)

To derive this codatatype we need to read the sublist relation from left to right: if ys is empty, then in the sublist relation we can choose nil or drop, which translate to the two arguments of nil in STF; otherwise, we can choose keep or drop, which translate to cons in STF.

All these are pointing to a general theory, but we can probably only hope to develop a very specialised form first, more for the purpose of delivering a preliminary form of the idea. Both definitions of _⊆_ and BT' stem from the sublist computation, which might be described by

sublist : List a → List (List a)
sublist []       = pure []
sublist (x ∷ xs) = ((x ∷_) <$> sublist xs) <|> sublist xs

From (a suitable representation of) sublist, we should be able to derive BT' p xs and _⊆_, which is the ‘position’ component in the (indexed) container representation of BT' (that maps positions to elements): ∀ ys → ys ⊆ xs → p ys. Naturally, there should be an isomorphism between the aggregate datatype BT' and its container representation, but this is only half of the story. What’s interesting is that we should be able to specify some computation in terms of the container representation and then implement the computation on the aggregate datatype. For this to be interesting, the specification should be clean and easy to write, while the implementation should exploit the fact that we are computing all the results to improve efficiency, for example by dynamic programming or incremental techniques. If it’s too difficult to derive the implementation, at least there should be a framework similar to the fundamental theorem of (finite) calculus that simplifies the verification.

Agda file: AlwaysInhabited.agda