This blog post will carry on from the previous one and introduce dependent types. So what is a dependent type? To motivate the idea let’s talk about equality. Remember that we interpret propositions as types, so if we have x, y : A then the statement “x is equal to y” corresponds to some type, let’s call it x =_A y. This type depends on its values: for example we expect to be able to prove (i.e. construct) 3 =_{\mathbb{N}} 3, but not to be able to prove 2 =_{\mathbb{N}} 3, and so we will have an equality type that depends on its values. This idea is also being explored in various programming languages. These languages have a type like \mathrm{Vec}(n, A), where l : \mathrm{Vec}(n, A) means that l is a list of n elements from the type A. Since the length of the list is part of its type, and so is known ahead of time, it is impossible to ask questions like, “What is the first element of this empty list?” Indeed dependent types are so powerful that one can write a compiler and be sure that the compiler preserves the meaning of a program. Such generalisations of type systems are useful in programming: they allow us to structure information in more general and precise ways. Another reason dependent type theory is interesting is its elegance. Dependent type theory allows values to become part of types, and makes every type a value, thereby uniting both notions.

The Universe

The first question we need to ask is, “What is the type of a type?” The reason this is important is that we want to be able to speak of functions which return a type. For example, we might have the type \mathrm{List}(A), which is the type of finite lists of elements from A. So \mathrm{List} is a function that takes a type as input and returns one as output, and since every function requires an input type and an output type we need to be able to answer that question. So, the type of a type is called “the Universe,” and is denoted \mathcal{U}. The thing is, we have to be careful about this self-reference. For example, in Set Theory when we give ourselves the ability to make “a set of all sets that satisfy some property,” we get a contradiction called Russell’s paradox. In a similar way we get a contradiction if we just say that every type has the type \mathcal{U}, which would imply that \mathcal{U} : \mathcal{U}, and this can be used to derive a paradox of self-reference. However, the proof of this isn’t as elegantly intuitive as Russell’s paradox. So instead we postulate an infinite hierarchy of universes, where we may have for example A : \mathcal{U}_0, and \mathcal{U}_0 : \mathcal{U}_1, and every universe is a member of the universe one step larger than it. But we will just sweep this complexity under the rug and say things like A : \mathcal{U}, leaving out the size of the universe. There are formal ways to deal with such ambiguity, but they are complicated and we have a different path.
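Proof assistants make this hierarchy explicit. As a quick illustration in Lean 4 (the language I’ll use for code sketches in this post), the universes are written Type 0, Type 1, and so on, and each lives in the next one up:

```lean
-- Each universe lives in the one above it; there is no Type : Type.
#check Nat      -- Nat : Type   (that is, Type 0)
#check Type     -- Type : Type 1
#check Type 1   -- Type 1 : Type 2
```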

Already we have added quite a lot of power to our programming language. For example, consider the type \mathrm{Vec}(n, A) of finite vectors of n elements each of the type A. We may define it by recursion like this:

\begin{aligned} \mathrm{Vec} &: \mathbb{N} \to \mathcal{U} \to \mathcal{U} \\ \mathrm{Vec}(0, A) &:\equiv \mathbf{1} \\ \mathrm{Vec}(\mathrm{succ}(n), A) &:\equiv A \times \mathrm{Vec}(n, A) \end{aligned}

Or in mathematical English, we define the type of vectors with zero length as \mathbf{1}. Why does this make sense? A vector of length zero contains no elements of A, and yet one should be able to trivially construct an empty vector without specifying any information. So, we define a vector of length 0 in A to be an element of \mathbf{1} which can be trivially constructed.

The next definition says that a vector of length \mathrm{succ}(n) in A is the same as an element of A paired with a vector of length n in A.
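To make this concrete, here is the same definition as a minimal Lean 4 sketch; the name Vec is my own (Lean’s library has its own, different vector types):

```lean
-- Vec n A by recursion on n: a vector of length 0 is the unit type,
-- and a vector of length n + 1 is a head paired with a shorter tail.
def Vec : Nat → Type → Type
  | 0,     _ => Unit
  | n + 1, A => A × Vec n A
```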

Dependent Functions

Dependent functions are like regular functions A \to B, except that the type of the output may depend on the input. For example, what would be the type of the function that concatenates the vectors we defined earlier? Well, the function takes any n_1, n_2 : \mathbb{N}, which are the lengths of the vectors, one value with the type \mathrm{Vec}(n_1, A) and another value with the type \mathrm{Vec}(n_2, A), and returns a value with the type \mathrm{Vec}(n_1 + n_2, A), since when you concatenate two vectors the length of the result is the sum of the lengths of each vector. So the type of the output returned by the function depends on the input.

Formation Rule So when can you talk about having a dependent function? Well, you need an input type A, and a type family P : A \to \mathcal{U} where if you put in x : A, then the output will have the type P(x). Given these two pieces of data we have the type \prod_{x : A} P(x) : \mathcal{U} of dependent functions which take x : A as input and return f(x) : P(x) as output. This type is pronounced “pi x in A, P of x,” or we may speak of a function which “takes (any) x : A as input and returns a P(x) as output.”

For example, let’s assume that you have a type A and an element x : A. Consider the function \mathrm{repeat}, which takes a number n : \mathbb{N} and gives you back a vector with n copies of x, so for example when we put in 3 we get back [x, x, x]. Given those assumptions, what type will \mathrm{repeat} have? Well, if we implement it correctly we will have \mathrm{repeat} : \prod_{n : \mathbb{N}} \mathrm{Vec}(n, A), which means that \mathrm{repeat} takes a number n and outputs a vector of length n with elements from A. The formation rule allows us to say that \prod_{n : \mathbb{N}} \mathrm{Vec}(n, A) is a type, since \mathrm{Vec}(-, A) is a type family over \mathbb{N}: if you plug in a natural number n : \mathbb{N} you get a type \mathrm{Vec}(n, A). From now on however, I will leave out formation rules; they are pretty obvious most of the time.
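In Lean 4 terms, the formation rule is just the statement that the Π-type below is itself a well-formed type, assuming the Vec sketch from earlier:

```lean
-- Vec (·) Nat is a type family over Nat, so the Π-type is a type:
#check (n : Nat) → Vec n Nat   -- (n : Nat) → Vec n Nat : Type
```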

Introduction Rule To make a dependent function, just like with a regular function you need an expression that will be the body of your dependent function; let’s call it \phi like last time. If you want your dependent function to have the type \prod_{x : A} P(x), you need to make sure that whenever x : A, \phi : P(x). Given this assumption, \lambda \, x . \phi : \prod_{x : A} P(x).

Elimination Rule Assume f : \prod_{x : A} P(x); then one may use the dependent function f by applying it to input. So, given x : A, we have f(x) : P(x). Let me give an example of this rule: remember the function \mathrm{repeat} : \prod_{n : \mathbb{N}} \mathrm{Vec}(n, A). What happens if you plug in 3 : \mathbb{N} for the input n? Well, you get a vector of length 3 in A, namely \mathrm{repeat}(3) : \mathrm{Vec}(3, A), as output.

Computation Rule Just the same as regular functions, (\lambda \, x. \phi)(y) \equiv \phi[x/y].

Uniqueness Principle Given f : \prod_{x : A} P(x) we have f \equiv \lambda \, x . f(x).

Just like before, dependent functions with multiple inputs will be defined using nested dependent functions. So for example, if I have a dependent function with two inputs a : A and b : B, I will create something with the type \prod_{a : A} \left( \prod_{b : B} \cdots \right), which I’ll simply write as \prod_{a : A} \prod_{b : B} \cdots. Also, if I have a dependent function which takes multiple inputs with the same type, say x : A and y : A, I’ll write the type as \prod_{x, y : A} for short.

Induction

How do you construct a dependent function out of the natural numbers? Let’s say we have a type family P : \mathbb{N} \to \mathcal{U}, and we want to make a function f : \prod_{x : \mathbb{N}} P(x). How do we do this? The idea is very similar to the recursion principle we covered last time.

First of all, we need the value of the function on 0. Since our function f is supposed to have the type \prod_{x : \mathbb{N}} P(x), when we put in 0 for x we expect to get some value f(0) : P(0). So let’s assume e_0 : P(0).

Just like with recursion, we need to handle natural numbers of the form \mathrm{succ}(n) somehow. Generalising the idea from the recursion principle, we assume e_\mathrm{succ} : \prod_{n : \mathbb{N}} P(n) \to P(\mathrm{succ}(n)). The function e_\mathrm{succ} maps any natural number n, together with the value of the function on n (which has the type P(n)), to the value of the function on \mathrm{succ}(n) (which has the type P(\mathrm{succ}(n))).

The induction rule is also called the dependent elimination rule, or the dependent recursion rule, since it is a generalisation to the setting of dependent types of the recursion/elimination rule we talked about last time.

Given these two pieces of information e_0 and e_\mathrm{succ}, we can construct a dependent function f :\equiv \mathrm{ind}_\mathbb{N}(e_0, e_\mathrm{succ}) : \prod_{n : \mathbb{N}} P(n). So that’s the dependent version of the elimination rule for natural numbers. The next question is, “What happens when I apply input to this function f?” The resulting computation rule is exactly the same as in the recursive case:

\begin{aligned} f(0) &\equiv e_0 \\ f(\mathrm{succ}(n)) &\equiv e_\mathrm{succ}(n, f(n)) \end{aligned}
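Lean 4’s built-in eliminator Nat.rec packages exactly this rule; here is a minimal sketch of assembling a dependent function out of e_0 and e_\mathrm{succ}:

```lean
-- ind_ℕ(e0, esucc): the base case and the inductive step assemble
-- into a dependent function defined on all of Nat.
example (P : Nat → Type) (e0 : P 0)
    (esucc : (n : Nat) → P n → P (n + 1)) : (n : Nat) → P n :=
  fun n => Nat.rec (motive := P) e0 esucc n
```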

Let’s apply this rule to define a function \mathrm{repeat} : \prod_{A : \mathcal{U}} \prod_{a : A} \prod_{n : \mathbb{N}} \mathrm{Vec}(n, A). Intuitively, what \mathrm{repeat} does is take a : A and n : \mathbb{N} and give back the vector [a, a, \ldots, a] : \mathrm{Vec}(n, A) which has n copies of a. Let’s start by introducing two variables for A : \mathcal{U} and a : A, and filling in a template for the induction principle for the natural numbers, since we want to define a dependent function that takes any n : \mathbb{N} and gives back a vector of n copies of a.

\begin{aligned} \mathrm{repeat} &: \prod_{A : \mathcal{U}} \prod_{a : A} \prod_{n : \mathbb{N}} \mathrm{Vec}(n, A) \\ \mathrm{repeat}(A, a) &:\equiv \mathrm{ind}_\mathbb{N}(e_0, e_\mathrm{succ}) \\ \end{aligned}

So we have to define e_0 and e_\mathrm{succ}. Remember that e_0 is the value of the function that gives a vector of n copies of a when n \equiv 0. Its type is, recalling the definition, \mathrm{Vec}(0, A) \equiv \mathbf{1}. So we have to construct an element of \mathbf{1}, and we can trivially choose * : \mathbf{1} for this. So we can put * for e_0 in the template and get this:

\begin{aligned} \mathrm{repeat} &: \prod_{A : \mathcal{U}} \prod_{a : A} \prod_{n : \mathbb{N}} \mathrm{Vec}(n, A) \\ \mathrm{repeat}(A, a) &:\equiv \mathrm{ind}_\mathbb{N}(*, e_\mathrm{succ}) \\ \end{aligned}

Now we have to fill in e_\mathrm{succ}. What does it do? It is the recursive step that takes any n : \mathbb{N}, and the value of the function on n, and gives us back the value of the function on \mathrm{succ}(n). Remember, we are trying to define a function that takes n and gives you back a vector of n copies of a. The value of this function on n will be a vector of n elements in A, so it will have the type \mathrm{Vec}(n, A). Similarly, the value of this function on \mathrm{succ}(n) will have the type \mathrm{Vec}(\mathrm{succ}(n), A). So, to summarise, we have to define:

e_\mathrm{succ} : \prod_{n : \mathbb{N}} \mathrm{Vec}(n, A) \to \mathrm{Vec}(\mathrm{succ}(n), A)

Now let’s say n : \mathbb{N} and \mathrm{repeat}_n : \mathrm{Vec}(n, A) are the inputs to e_\mathrm{succ}. What should the output be? First of all we can rewrite the type of the output using the definition \mathrm{Vec}(\mathrm{succ}(n), A) \equiv A \times \mathrm{Vec}(n, A), which means that we have to output a pair of an element of A and a vector of n elements from A. Since e_\mathrm{succ} should add another copy of a to the start of the vector \mathrm{repeat}_n, we have no choice but to output the pair (a, \mathrm{repeat}_n). So we should define e_\mathrm{succ} :\equiv \lambda \, n. \lambda \, \mathrm{repeat}_n. (a, \mathrm{repeat}_n), and plugging this back into our template we get the following completed definition:

\begin{aligned} \mathrm{repeat} &: \prod_{A : \mathcal{U}} \prod_{a : A} \prod_{n : \mathbb{N}} \mathrm{Vec}(n, A) \\ \mathrm{repeat}(A, a) &:\equiv \mathrm{ind}_\mathbb{N}(*, \lambda \, n. \lambda \, \mathrm{repeat}_n. (a, \mathrm{repeat}_n)) \\ \end{aligned}

Notice how we did not define \mathrm{repeat} and then prove separately that \mathrm{repeat} gives a vector of length n; rather, the type of \mathrm{repeat} shows that it gives a vector with the correct length by construction.
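Here is the same function as a Lean 4 sketch, using the Vec from earlier. I call it repeatVec (a name of my own, since repeat is reserved in Lean); its two pattern-matching equations are precisely the two computation rules of \mathrm{ind}_\mathbb{N}:

```lean
-- repeatVec A a n builds (a, (a, ..., (a, ()))) of length n.
-- The 0 case plays the role of e0 ≡ *, the n + 1 case of e_succ.
def repeatVec (A : Type) (a : A) : (n : Nat) → Vec n A
  | 0     => ()
  | n + 1 => (a, repeatVec A a n)
```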

Dependent pairs

On to dependent pairs. Dependent pairs generalise the product types A \times B. A dependent pair is a pair of things where the type of the second depends on the value of the first. The notation for the type is \sum_{x : A} P(x), where A : \mathcal{U} is a type and P : A \to \mathcal{U} is a type family.

Introduction Rule Given an appropriate pair of elements x : A, y : P(x) we may form (x, y) : \sum_{x : A} P(x).

Recursion principle Suppose we want to construct a function from the type of dependent pairs \sum_{x : A} P(x) to some other type X : \mathcal{U}. Just like with the other types, the data we need will reflect the constructor for dependent pairs. Given a way to map the components of a dependent pair into X:

e_p : \prod_{x : A} P(x) \to X

There exists a function \mathrm{rec}_{\sum_{x : A} P(x)}(e_p) : \left(\sum_{x : A} P(x)\right) \to X that satisfies the following:

Computation rule Given x : A, and y : P(x) we have:

\mathrm{rec}_{\sum_{x : A} P(x)}(e_p)((x, y)) \equiv e_p(x, y)
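In Lean 4 this recursion principle is Sigma.rec with a constant motive; a minimal sketch:

```lean
-- rec_Σ: to eliminate a dependent pair into a fixed type X,
-- it is enough to handle the two components with ep.
example {A X : Type} {P : A → Type}
    (ep : (x : A) → P x → X) : (Σ x : A, P x) → X :=
  fun p => Sigma.rec (motive := fun _ => X) ep p
```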

Induction principle Let’s now generalise the recursion rule to the dependent case. Suppose we have a type family X : \left(\sum_{x : A} P(x)\right) \to \mathcal{U}, equipped with data that reflects the constructor for a dependent pair:

e_p : \prod_{x : A} \prod_{y : P(x)} X((x, y))

Then we have a dependent function \mathrm{ind}_{\sum_{x : A} P(x)}(e_p) : \prod_{p : \sum_{x : A} P(x)} X(p) satisfying the following:

Computation rule Given x : A, and y : P(x) we have:

\mathrm{ind}_{\sum_{x : A} P(x)}(e_p)((x, y)) \equiv e_p(x, y)

As an easy example, let us define the projections that extract out the components of a dependent pair. First things first, let’s suppose A : \mathcal{U} and P : A \to \mathcal{U} and construct the first projection:

\mathrm{pr}_1 : \left(\sum_{x : A} P(x)\right) \to A \\ \mathrm{pr}_1 :\equiv \mathrm{rec}_{\sum_{x : A} P(x)}(...)

We have to fill in the dots with a function with the type \prod_{x : A} P(x) \to A but we can just ignore the given element of P(x) to give the following definition for \mathrm{pr}_1:

\mathrm{pr}_1 :\equiv \mathrm{rec}_{\sum_{x : A} P(x)}(\lambda \, x. \lambda \, y. x)

What about the second projection? It will take any pair p : \sum_{x : A} P(x) and give you its second component, which will have the type P(\mathrm{pr}_1(p)). So let’s define it then, starting with the template for the induction rule:

\mathrm{pr}_2 : \prod_{p : \sum_{x : A} P(x)} P(\mathrm{pr}_1(p)) \\ \mathrm{pr}_2 :\equiv \mathrm{ind}_{\sum_{x : A} P(x)}(e_p)

We have to define e_p : \prod_{x : A} \prod_{y : P(x)} P(\mathrm{pr}_1((x, y))); however, \mathrm{pr}_1((x, y)) \equiv x by the computation rule for recursion. So we just have to define e_p : \prod_{x : A} \prod_{y : P(x)} P(x), but this is trivial:

\mathrm{pr}_2 :\equiv \mathrm{ind}_{\sum_{x : A} P(x)}(\lambda \, x. \lambda \, y. y)
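As a Lean 4 sketch, with pr1 and pr2 as my own names (Lean’s built-in versions are Sigma.fst and Sigma.snd):

```lean
-- The first projection discards the second component.
def pr1 {A : Type} {P : A → Type} : (Σ x : A, P x) → A
  | ⟨x, _⟩ => x

-- The second projection is dependent: its output type uses pr1.
-- pr1 ⟨x, y⟩ reduces to x, so returning y : P x is well-typed.
def pr2 {A : Type} {P : A → Type} : (p : Σ x : A, P x) → P (pr1 p)
  | ⟨_, y⟩ => y
```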

Dependent types and logic

Let’s talk about the logical interpretation of dependent types. First, in logical terms, what is a type family? What does P : A \to \mathcal{U} mean in the context of logic? Well, P is a function that associates every element of A to a type. Since propositions are types, P is a proposition that depends on an element of A, that is to say a predicate on A.

Now what about dependent functions? Given a type family P : A \to \mathcal{U}, the dependent function type \prod_{x : A} P(x) is a way to map every element x : A to some value in P(x). Logically speaking, we are mapping any element x : A to a corresponding proof of some proposition P(x) : \mathcal{U}, since propositions are types, and in particular proving a proposition is just constructing an element of a particular type. So in other words we are showing that the proposition P(x) holds for every x : A. This means that the dependent function type \prod_{x : A} P(x) is how we represent “for all” statements in type theory. To put it another way, the statement “for all x : A, P(x)” corresponds to the type \prod_{x : A} P(x).
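For instance, in Lean 4 a proof of a “for all” statement literally is a dependent function sending each input to a proof about it (Nat.zero_le is a library lemma):

```lean
-- A proof of "for all n, 0 ≤ n" is a dependent function from
-- each n : Nat to a proof of 0 ≤ n.
example : ∀ n : Nat, 0 ≤ n := fun n => Nat.zero_le n
```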

This clarifies why we call the dependent generalisation of the recursion rule the “induction principle.” On the natural numbers for example, the induction principle shows you how to construct a dependent function \prod_{x : \mathbb{N}} P(x) by handling the base case with some e_0 : P(0) and the recursive case with some e_\mathrm{succ} : \prod_{n : \mathbb{N}} P(n) \to P(\mathrm{succ}(n)). The usual principle of mathematical induction, which gives a method for proving “for all” statements, is just a special case of the induction principle for natural numbers.

Finally, what about dependent pairs? Try to guess what their logical interpretation is before reading on. The type of dependent pairs \sum_{x : A} P(x) is generated by pairs (x, y) : \sum_{x : A} P(x) where x : A and y : P(x). So dependent pairs are pairs of elements x : A together with proofs that some proposition P holds for that element. Naturally, we interpret the type of dependent pairs \sum_{x : A} P(x) as the proposition “there exists an element x : A such that P(x).” What the recursion rule for dependent pairs says, logically speaking, is that given a proposition X : \mathcal{U}, we can show that \sum_{x : A} P(x) implies X by showing \prod_{x : A} P(x) \to X. Or in more words: by assuming an arbitrary element x : A exists, such that P(x), and deducing X from that.
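As a small Lean 4 sketch using the Vec from earlier, here is an existence statement proved, constructively, by exhibiting a dependent pair:

```lean
-- An existence proof as a dependent pair: the witness 1 together
-- with a vector of that length.
example {A : Type} (a : A) : Σ n : Nat, Vec n A := ⟨1, (a, ())⟩
```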

As an application of everything we’ve learned so far, we will prove the “type theoretic axiom of choice.” Suppose we are given A : \mathcal{U}, B : A \to \mathcal{U}, and a type family P : \prod_{x : A} B(x) \to \mathcal{U}. Then we can construct the following function:

AC : \left(\prod_{x : A} \sum_{b : B(x)} P(x, b)\right) \to \sum_{f : \prod_{x : A} B(x)} \prod_{x : A} P(x, f(x))

This seems at first glance to be very similar to the set theoretic axiom of choice. We have a type A, and a family of types B associated with every element of A. If for every x : A there is at least one b : B(x) such that P(x, b), then there is a way of mapping each element x : A to a choice of b : B(x), called f. This f satisfies the property that the chosen elements f(x) : B(x) all satisfy P.

The proof is quite trivial; I will write it with pattern matching notation first:

AC(s) :\equiv (\cdots, \cdots)

Let us first construct the first component of the output of AC, which is a function f : \prod_{x : A} B(x). Let x : A; we need to construct an element of B(x). But we can take the first component of s(x) : \sum_{b : B(x)} P(x, b), so we define f :\equiv \lambda \, x. \mathrm{pr}_1(s(x)):

AC(s) :\equiv (\lambda \, x . \mathrm{pr}_1(s(x)), \cdots)

Now what about the second component? It needs to be a function \prod_{x : A} P(x, f(x)), and since f(x) \equiv \mathrm{pr}_1(s(x)) we just need to construct \prod_{x : A} P(x, \mathrm{pr}_1(s(x))). Obviously, the appropriate thing to use is the second projection, since \lambda \, x. \mathrm{pr}_2(s(x)) has the type we want:

AC(s) :\equiv (\lambda \, x . \mathrm{pr}_1(s(x)), \lambda \, x . \mathrm{pr}_2(s(x)))
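Putting the whole proof together as a Lean 4 sketch (the name AC and this Σ-formulation are mine, mirroring the statement above):

```lean
-- The type-theoretic axiom of choice: rearrange a family of pairs
-- into a pair of a choice function and a proof that it works.
def AC {A : Type} {B : A → Type} {P : (x : A) → B x → Type}
    (s : (x : A) → Σ b : B x, P x b) :
    Σ f : (x : A) → B x, (x : A) → P x (f x) :=
  ⟨fun x => (s x).fst, fun x => (s x).snd⟩
```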

Done! Now we should reflect on this for a moment: in type theory we can prove this version of the axiom of choice; it is not an axiom. Why does a similar-seeming statement in set theory need to be assumed, while the equivalent version in type theory can be proven? Essentially, there is a subtle difference between the notion of “there exists” that we have in type theory, based on dependent pairs, and the traditional set theoretic notion. In type theory, existence means being able to construct an object satisfying particular properties. In set theory this is not necessarily the case: one can show that something exists without giving a method for constructing it. So, in type theory the statement “for every x : A there exists a corresponding b : B(x) such that P(x, b)” is a method of constructing a b from a given x such that we always have P(x, b), and we just have to rearrange the data into two separate parts: firstly, a method for constructing a b from a given x (this is the function f), and secondly, a proof that this method always outputs elements that satisfy P. There are several ways of recovering the traditional notion of “existence,” which does not consist of construction, and we may talk about them later. This is a special case of the general distinction between classical (a.k.a. traditional) logic and constructive logic such as the one we have based on type theory. A full discussion of this distinction would be interesting, but unfortunately this blog post is already too long. So here are some exercises; I can’t give more interesting ones since we haven’t defined equality yet:

Exercise Implement concatenation for vectors. First implement it using pattern matching notation, and then translate the pattern matching notation into the recursion/induction principles for the appropriate types. Notice that if you model your implementation of concatenation on your implementation of addition for natural numbers, then you will make your life much easier.

Exercise Let’s say I have types A, B : \mathcal{U} and a function f : A \to B. I can apply this function to each component of a vector \mathrm{Vec}(n, A) to get a vector \mathrm{Vec}(n, B). Implement a general procedure for applying a given function to each component of a vector:

\mathrm{map} : \prod_{n : \mathbb{N}} \prod_{f : A \to B} \mathrm{Vec}(n, A) \to \mathrm{Vec}(n, B)

Exercise Implement a procedure that calculates the product of all the components of a vector \mathrm{Vec}(n, \mathbb{N}), and a similar procedure that calculates the sum of all the components. If you see a similarity between these two, define a general procedure that captures that similarity and rewrite your definitions in terms of this general procedure.
