- A+

# The setup

Consider a type of terms parameterized over a type of function symbols `node`

and a type of variables `var`

:

`data Term node var = VarTerm !var | FunTerm !node !(Vector (Term node var)) deriving (Eq, Ord, Show) instance Functor (Term node) where fmap f (VarTerm var) = VarTerm (f var) fmap f (FunTerm n cs) = FunTerm n (Vector.map (fmap f) cs) instance Monad (Term node) where pure = VarTerm join (VarTerm term) = term join (FunTerm n cs) = FunTerm n (Vector.map join cs) `

This is a useful type, since we encode open terms with `Term node Var`

, closed terms with `Term node Void`

, and contexts with `Term node ()`

.

The goal is to define a type of substitutions on `Term`

s in the most pleasant possible way. Here's a first stab:

`newtype Substitution (node ∷ Type) (var ∷ Type) = Substitution { fromSubstitution ∷ Map var (Term node var) } deriving (Eq, Ord, Show) `

Now let's define some auxiliary values related to `Substitution`

:

`subst ∷ Substitution node var → Term node var → Term node var subst s (VarTerm var) = fromMaybe (MkVarTerm var) (Map.lookup var (fromSubstitution s)) subst s (FunTerm n ts) = FunTerm n (Vector.map (subst s) ts) identity ∷ Substitution node var identity = Substitution Map.empty -- Laws: -- -- 1. Unitality: -- ∀ s ∷ Substitution n v → s ≡ (s ∘ identity) ≡ (identity ∘ s) -- 2. Associativity: -- ∀ a, b, c ∷ Substitution n v → ((a ∘ b) ∘ c) ≡ (a ∘ (b ∘ c)) -- 3. Monoid action: -- ∀ x, y ∷ Substitution n v → subst (y ∘ x) ≡ (subst y . subst x) (∘) ∷ (Ord var) ⇒ Substitution node var → Substitution node var → Substitution node var s1 ∘ s2 = Substitution (Map.unionWith (λ _ _ → error "this should never happen") (Map.map (subst s1) (fromSubstitution s2)) ((fromSubstitution s1) `Map.difference` (fromSubstitution s2))) `

Clearly, `(Substitution n v, ∘, identity)`

is a monoid (ignoring the `Ord`

constraint on `∘`

) and `(Term n v, subst)`

is a monoid action of `Substitution n v`

.

Now suppose that we want to make this scheme encode substitutions *that change the variable type*. This would look like some type `SubstCat`

that satisfies the module signature below:

`data SubstCat (node ∷ Type) (domainVar ∷ Type) (codomainVar ∷ Type) = … ∷ Type subst ∷ SubstCat node dom cod → Term node dom → Term node cod identity ∷ SubstCat node var var (∘) ∷ (Ord v1, Ord v2, Ord v3) → SubstCat node v2 v3 → SubstCat node v1 v2 → SubstCat node v1 v3 `

This is almost a Haskell `Category`

, but for the `Ord`

constraints on `∘`

. You might think that if `(Substitution n v, ∘, identity)`

was a monoid before, and `subst`

was a monoid action before, then `subst`

should now be a category action, but in point of fact category actions are just functors (in this case, a functor from a subcategory of Hask to another subcategory of Hask).

Now there are some properties we'd hope would be true about `SubstCat`

:

`SubstCat node var Void`

should be the type of ground substitutions.`SubstCat Void var var`

should be the type of flat substitutions.`instance (Eq node, Eq dom, Eq cod) ⇒ Eq (SubstCat node dom cod)`

should exist (as well as similar instances for`Ord`

and`Show`

).- It should be possible to compute the domain variable set, the image term set, and the introduced variable set, given a
`SubstCat node dom cod`

. - The operations I have described should be about as fast/space-efficient as their equivalents in the
`Substitution`

implementation above.

The simplest possible approach to writing `SubstCat`

would be to simply generalize `Substitution`

:

`newtype SubstCat (node ∷ Type) (dom ∷ Type) (cod ∷ Type) = SubstCat { fromSubstCat ∷ Map dom (Term node cod) } deriving (Eq, Ord, Show) `

Unfortunately, this does not work because when we run `subst`

it may be the case that the term we are running substitution on contains variables that are not in the domain of the `Map`

. We could get away with this in `Substitution`

since `dom ~ cod`

, but in `SubstCat`

we have no way to deal with these variables.

My next attempt was to deal with this issue by also including a function of type `dom → cod`

:

`data SubstCat (node ∷ Type) (dom ∷ Type) (cod ∷ Type) = SubstCat !(Map dom (Term node cod)) !(dom → cod) `

This causes a few problems, however. Firstly, since `SubstCat`

now contains a function, it can no longer have `Eq`

, `Ord`

, or `Show`

instances. We cannot simply ignore the `dom → cod`

field when comparing for equality, since the semantics of substitution change depending on its value. Secondly, it is now no longer the case that `SubstCat node var Void`

represents a type of ground substitutions; in fact, `SubstCat node var Void`

is *uninhabited*!

Érdi Gergő suggested on Facebook that I use the following definition:

`newtype SubstCat (node ∷ Type) (dom ∷ Type) (cod ∷ Type) = SubstCat (dom → Term node cod) `

This is certainly a fascinating prospect. There is an obvious category for this type: the Kleisli category given by the `Monad`

instance on `Term node`

. I am not sure if this actually corresponds to the usual notion of substitution composition, however. Unfortunately, this representation cannot have instances for `Eq`

et al. and I suspect it could be very inefficient in practice, since in the best case it will end up being a tower of closures of height `Θ(n)`

, where `n`

is the number of insertions. It also doesn't allow computation of the domain variable set.

# The question

Is there a sensible type for `SubstCat`

that fits my requirements? Can you prove that one does not exist? If I give up having correct instances of `Eq`

, `Ord`

, and `Show`

, is it possible?

There is an obvious category for this type: the Kleisli category given by the Monad instance on Term node. I am not sure if this actually corresponds to the usual notion of substitution composition, however.

It does correspond to that, and it's an all-around correct definition of parallel substitution for well-scoped terms. You may note that this substitution is *total*; this is a requirement if you want term substitution to be total, i. e. that it's a functor from the category of substitutions (**Subst**) to **Set**. For a simple example for why partial substitutions don't work, if you have an empty `SubstCat node () Void`

, then you would have to invent arbitrary closed terms when hitting `VarTerm ()`

in `subst`

- and closed terms don't even exist if you choose `Void`

for `node`

.

Hence, if you have `Map dom (Term node cod)`

then you have partial term substitution, i. e. a functor from **Subst** to the Kleisli category for `Maybe`

(disregarding for now formal complications from the `Ord`

constraints):

`type Subst node i j = Map i (Term node j) subst :: Ord i => Subst node i j -> Term node i -> Maybe (Term node j) subst sub (VarTerm x) = Map.lookup x sub subst sub (FunTerm f ts) = FunTerm f <$> traverse (subst sub) ts `

Now, total term substitution along with your 1-5 desiderata for `SubstCat`

is possible, but not with the current kinds for `SubstCat`

and `Term`

. As I mentioned, in this case substitutions must be total, but currently we can just have `SubstCat node dom cod`

with some infinite `dom`

, which makes equality of substitutions undecidable.

So let's switch a more precise formalization, which contains only what we actually care about here. We have well-scoped untyped terms, and we assume that terms are finite and live in finite variable contexts.

Untyped terms imply that only sizes of variable contexts matter. So, **Subst** has natural numbers as objects:

`data Nat = Z | S Nat deriving (Eq, Show) `

Terms are indexed by `n :: Nat`

, containing `n`

possible values for variables:

`-- {-# language TypeInType, GADTs, TypeFamilies, ScopedTypeVariables, -- FlexibleInstances, MultiParamTypeClasses, RankNTypes, -- TypeApplications, StandaloneDeriving #-} data Fin n where FZ :: Fin (S n) FS :: Fin n -> Fin (S n) deriving instance Eq (Fin n) deriving instance Ord (Fin n) deriving instance Show (Fin n) data Term node (n :: Nat) where VarTerm :: !(Fin n) -> Term node n FunTerm :: !node -> !(Vector (Term node n)) -> Term node n deriving instance Eq node => Eq (Term node n) deriving instance Ord node => Ord (Term node n) deriving instance Show node => Show (Term node n) `

Substitutions (morphisms) are vectors of terms:

`data Vec a n where Nil :: Vec a Z (:>) :: a -> Vec a n -> Vec a (S n) infixr 5 :> deriving instance Eq a => Eq (Vec a n) deriving instance Ord a => Ord (Vec a n) deriving instance Show a => Show (Vec a n) type Subst node i j = Vec (Term node j) i `

Term substitution is as follows:

`subst :: Subst node i j -> Term node i -> Term node j subst (t :> _ ) (VarTerm FZ) = t subst (_ :> ts) (VarTerm (FS x)) = subst ts (VarTerm x) subst sub (FunTerm f ts) = FunTerm f (Vector.map (subst sub) ts) `

Composition is just pointwise `subst`

:

`comp :: Subst node i j -> Subst node j k -> Subst node i k comp Nil sub2 = Nil comp (t :> ts) sub2 = subst sub2 t :> comp ts sub2 `

Identity substitution is not so easy. We need to dispatch on a type-level `Nat`

. For this, we use here a type class; `singletons`

would be a heavier but more principled solution. The implementation itself is also a bit mind-bending. It makes use of the fact that `Subst node n m`

is isomorphic to `(Fin n -> Term node m)`

. In fact, we could just use the functional representation all the time, but then the `Eq, Ord and Show`

instance would still effectively need the backwards conversion, and we would not have the spacetime boundedness guarantees of (strict) vectors.

`class Tabulate n where tabulate :: (Fin n -> Term node m) -> Subst node n m instance Tabulate Z where tabulate f = Nil instance Tabulate n => Tabulate (S n) where tabulate f = f FZ :> tabulate (f . FS) idSubst :: Tabulate n => Subst node n n idSubst = tabulate VarTerm `

That's it! Proofs that `subst`

preserves `idSubst`

and `comp`

are left to the reader.

PS: I note that in literature and formal Agda/Coq developments, `Nat`

indices of `Subst`

are usually in swapped order, and `subst`

has *contravariant* action.