Why Dependent Types matter

tl;dr: With respect to expressive power in logic, Dependent Types matter because they allow us to express universal and existential quantification.

Most programming languages (Python, JavaScript) have the power to express propositional logic. For example, you can express AND, NOT, OR in Python, JavaScript or any other mainstream programming language. See my previous post Hierarchy of logical systems for more details on various logical systems.

However, if we are interested in mathematical proofs (and software correctness built upon those), then we need to work with higher order logic where we have the notion of universal (forall) and existential (exists) quantificators.

So, what are Dependent Types? Wikipedia formally defines them as:

Loosely speaking, dependent types are similar to the type of an indexed family of sets. More formally, given a type A:U in a universe of types U, one may have a family of types B:A->U, which assigns to each term a:A a type B(a):U. We say that the type B(a) varies with a.

This definition allows us to map (transform) terms to types, thus bringing closer the notion of values and types.

Let’s have a look at an Idris example of dependent types:

f : Nat -> Type
f Z = Bool
f x = String

As a result:

Idris> f 0
Bool : Type
Idris> f 1
String : Type
Idris> f 2
String : Type

Python example of “dependent types”:

def f(x):
	if x == 0:
		return bool
	else:
		return str

As a result:

>>> f(0)
<type 'bool'>
>>> f(1)
<type 'str'>
>>> f(2)
<type 'str'>

So we seem to be able to encode those in Python as well. But, there is no compile-step in Python to let us know that the types flow correctly.

So we can’t really take advantage of having the programming language do all the necessary logical checks for us in order to form a notion of mathematical proof.

Where Idris really shines is in the following example:

f : {a : Nat} -> GT a 0 -> String
f {a} proof_that_a_gt_zero = "It is type safe that " ++ (show a) ++ " is greater than zero!"

produce_necessary_proof : (a : Nat) -> LTE 1 (S a)
produce_necessary_proof a = (LTESucc {left=Z} {right=a}) (LTEZero {right=a})

safe : Nat -> String
safe Z     = "Whoops!"
safe (S x) = f (produce_necessary_proof x)

Let’s try to understand this example by breaking it up.

1. LTESucc and LTEZero is a constructor for the data type LTE. A few examples for LTEZero:

LTEZero : LTE 0 0
Idris> LTEZero {right = 1}
LTEZero : LTE 0 1
Idris> LTEZero {right = 2}
LTEZero : LTE 0 2

LTESucc allows us to say that if we have a proof of LTE a b then we can construct a proof of LTE (a + 1) (b + 1):

Idris> :t LTESucc
LTESucc : LTE left right -> LTE (S left) (S right)

GT a 0 is just a friendly wrapper around LTE 1 a.

2. Let’s try to explain what f, produce_necessary_proof and safe do.
So f is a function that takes any natural number greater than zero, and a proof that it is greater than zero, and outputs a string for it.

It’s like saying, mathematically, that forall x in N, x > 0 -> “It is type safe that {x} is greater than zero!”. We can see how close these relationships are represented.

So to produce the proof that a natural number is greater than zero (LTE 1 x), we implement produce_necessary_proof. Examples:

Idris> produce_necessary_proof 0
LTESucc LTEZero : LTE 1 1
Idris> produce_necessary_proof 1
LTESucc LTEZero : LTE 1 2
Idris> produce_necessary_proof 2
LTESucc LTEZero : LTE 1 3

safe then just combines f with produce_necessary_proof.

3. Compile-time checks that f only accepts a natural number greater than zero. While this is a purely mathematical example, we can also do industry examples…

Now, given all of the above, we can’t really do that in Python (or most mainstream languages) and have the compiler do all the checks for us and produce proofs for them 🙂

Proofs with Idris

First, let’s try to understand what Idris’s Refl does. It is defined as such:

data (=) : a -> b -> Type where
    Refl : x = x

Now, let’s try to implement nats ourselves:

data MyNat = Zero | Next MyNat

my_add : MyNat -> MyNat -> MyNat
my_add Zero     m = m
my_add (Next k) m = Next (my_add k m)

What Idris does here at compile time, when it sees Refl it tries to evaluate the values by the given definitions. So it understands that (0 + 1) = 1, by my_add and definition of MyNat. E.g.:

my_proof : my_add Zero (Next Zero) = Next Zero
my_proof = Refl

Now let’s try to write functions (both head and tail recursive) that will convert a list to a vector so we have the length of the list at type level.

data Vect : Nat -> Type -> Type where
   VNil  : Vect Z a
   VCons : a -> Vect k a -> Vect (S k) a

We will prove watwat with the following type. We’ll see how and why we’ll be using it later.

watwat : (left : Nat) -> (right : Nat) -> S (plus left right) = plus left (S right)
watwat Z right = Refl -- to prove S (plus 0 right) = plus 0 (S right) it's just reflexivity, by def of plus

Now for the other case (first argument not zero), if we write a hole:

watwat (S left) right = ?hole

and check what we need to fill in: hole : S (S (plus left right)) = S (plus left (S right))

So in theory, we need to pass a lambda function which will take values from the previous type and produce values to match the new type.

That is, what we need to show here is that we can go from
S (plus left right) = plus left (S right)
to
S (plus (S left) right) = plus (S left) (S right) [original definition],
i.e. S (S (plus left right)) = S (plus left (S right)) [simplified by Idris when we checked the hole].

So now, it’d be good if we had a function with type (a = b) -> f a = f b. So we could’ve just used it with (watwat left right) to get the correct match. Core already has cong and it is exactly what we need.

watwat (S left) right = cong (watwat left right)

We proved it using lambda terms. To prove it with tactics, equivalently:

watwat (S left) right =
    let inductiveHypothesis = watwat left right in
        rewrite inductiveHypothesis in Refl
-- Or, even simpler, just watwat = plusSuccRightSucc

Now let’s go ahead and implement a function that converts a list to a vector (the vector will know the length at type level):

-- The following head recursion works fine, because types flow nicely
f : (l : List Char) -> Vect (length l) Char
f [] = VNil
f (x :: xs) = VCons x (f xs)

Now let’s implement the tail recursion of the same computation

f_tail : Vect n Char -> (l : List Char) -> Vect (length l + n) Char
f_tail v [] = v

The following won’t work because types won’t match:

f_tail {n} v (x :: xs) = f_tail (VCons x v) xs

If, instead we use a hole:

f_tail {n} v (x :: xs) = ?hole

We get that hole : Vect (S (plus (length xs) n)) Char

So we need to go from
Vect (length xs + S n) Char (Type of f_tail (VCons x v) xs)
to
Vect (S (plus (length xs) n)) Char (Expected type).

We can use our watwat theorem to get to it.

f_tail {n} v (x :: xs) =
	let inductiveHypothesis = watwat (length xs) n in
		rewrite inductiveHypothesis in (f_tail (VCons x v) xs)

To implement this in lambda terms, we first introduce ourselves to replace and sym:

*induction> :t replace
replace : (x = y) -> P x -> P y
*induction> :t sym
sym : (left = right) -> right = left
*induction> 

Example usage:

f : 3 = (4 - 1)
f = Refl

Now note how Idris evaluates the types:

*test> :t replace
replace : (x = y) -> P x -> P y
*test> :t replace {f = \x => Vect x Char} f
cong f : Vect 3 Char = Vect 3 Char

So proceeding with our implementation in lambda terms:

f_tail {n} v (x :: xs) =
	f_tail {n} v (x :: xs) = replace {P = \m => Vect m Char} (sym (watwat (length xs) n)) (f_tail_l (VCons x v) xs)

To recap what we did, by Curry-Howard isomorphism (proof by lambda construction) we proved the following theorems:

(left : Nat) -> (right : Nat) -> S (plus left right) = plus left (S right)
(l : List Char) -> Vect (length l) Char
Vect n Char -> (l : List Char) -> Vect (length l + n) Char

Bonus:
We can use :set showimplicits to see the implicit arguments, so that we can pass stuff to them.

Given:

f : 3 = 3
f = Refl

Here’s an example of cong which doubles an equality:

double : (x = x) -> (x + x = x + x)
double n = cong {f = \x => x + x} n

And another example of replace which replaces the type:

test_replace : {n : Type} -> (x = x) -> n -> n
test_replace f = replace {P = \x => n} f

Results in:

*test> double f
Refl : 6 = 6
*test> test_replace {n = Bool} f
replace Refl : Bool -> Bool

Dependent types in typed Racket

Racket recently introduced support for dependent types in the 6.11 release. This is very cool.

For untyped Racket, we have contracts. So we can do something like:

#lang racket

(define/contract (foo a b)
  (-> integer? real? real?)  ; compare:  foo : int real -> real
  (+ a b))

(foo 3 4.2) ; works fine.
; (foo 4.2 3) ; Gives an error.

But the commented code (foo 4.2 3) will only fail on run-time. Not that useful. Enter typed/racket.

Let’s try to implement reflexivity and see if the compiler will do the proper calculation on the types:

#lang typed/racket

(: refl 
   (-> ([v : Integer]
        [n : (v) (Refine [i : Integer]
                         (= i v))])
       Boolean))

(define (refl x y) (= x y))

Neat, now if we try to prove that 4 = 4, i.e. doing:

(define (wat) (refl (+ 1 3) 4))

It will work fine. However,

(define (wat2) (refl 4 5))

fails on compile-time which is very useful:

 Type Checker: type mismatch
  expected: (Refine (x : Integer) (= 4 x))
  given: (Refine (y : Positive-Byte) (= 5 y)) in: 5

So to demonstrate the dependent types support, we have vector-ref which takes a vector and an index and returns the element at the given index.

(: safe-vector-ref 
   (All (A) (-> ([v : (Vectorof A)]
                 [n : (v) (Refine [i : Integer]
                            (<= 0 i (- (vector-length v) 1)))])
                A)))
(define safe-vector-ref vector-ref)

Now we can try using safe-vector-ref:

(safe-vector-ref (vector 0 1) 0) ; 0
(safe-vector-ref (vector 0 1) 1) ; 1
;(safe-vector-ref (vector 0 1) 2)
#| Compiler error:
Type Checker: Polymorphic function `safe-vector-ref' could not be applied to arguments:
Argument y (position 1):
  Expected: (Vectorof A)
  Given:    (Vector Integer Integer)
Argument z (position 2):
  Expected: (Refine (d : Integer) (and (<= d (- (vector-length y) 1)) (<= 0 d)))
  Given:    (Refine (f : Positive-Byte) (= 2 f))

Result type:     A
Expected result: AnyValues
 in: (safe-vector-ref (vector 0 1) 2)
|#

Bonus: this is an implementation of a vector that contains its length at the type level:

#lang typed/racket
(: make-sized-vector
   (All (A) (-> ([v : Integer]
                 [n : (Vectorof Integer)])
                #:pre (v n)
                (= (vector-length n) v)
                (Vectorof Integer))))
(define (make-sized-vector n l) l)

;(make-sized-vector 2 '(1 2 3)) -- compile time error
(make-sized-vector 3 (vector 1 2 3))

We can definitely see a trend in how other popular languages will be moving towards having support for Dependent Types 🙂

Idris, dependent types and IO

The purpose of this post is to demonstrate the interaction between dependent types and IO in Idris.

In the examples, for a given IO string, we will pass it through a total and a non-total function to demonstrate how Idris handles both cases.

We will define a dependent type called Vect which will allow us to work with lists, but additionally it will contain the length of the list at type level as follows:

data Vect : Nat -> Type -> Type where
   VNil  : Vect Z a
   VCons : a -> Vect k a -> Vect (S k) a

So, in the following demonstration we will see how both the safe and unsafe code pass compile-time, but the latter may fail on run-time.

Safe:

total
wat_total : List Char -> Vect 2 Char
wat_total (x :: y :: []) = VCons x (VCons y VNil)
wat_total _ = VCons 'a' (VCons 'b' VNil)

unwat : Vect 2 Char -> List Char
unwat (VCons a (VCons b VNil)) = a :: b :: []

wrapunwrap_safe : String -> String
wrapunwrap_safe name = pack (unwat (wat_total (unpack name)))

greet_safe : IO ()
greet_safe = do
    putStr "What is your name? "
    name <- getLine
    putStrLn ("Hello " ++ (wrapunwrap_safe name))

main : IO ()
main = do
    putStrLn "Following safe greet, enter any number of chars"
    greet_safe

If we specify total to the wat_total function, it must produce a value for all possible inputs, and this is okay and safe. All the rest of the checks are done on compile time and the compiler will force you to be sure that any callers to wat_total have the types flow correctly.

Unsafe:

wat : List Char -> Vect 2 Char
wat (x :: y :: []) = VCons x (VCons y VNil)

unwat : Vect 2 Char -> List Char
unwat (VCons a (VCons b VNil)) = a :: b :: []

wrapunwrap : String -> String
wrapunwrap name = pack (unwat (wat (unpack name)))

greet : IO ()
greet = do
    putStr "What is your name? "
    name <- getLine
    putStrLn ("Hello " ++ (wrapunwrap name))

main : IO ()
main = do
    putStrLn "Following unsafe greet, you can only enter chars of size 2"
    greet

The tricky case is when we’re dealing with non-total functions. Let’s check a few return values:

*wat> wat []
wat [] : Vect 2 Char
*wat> wat ['a']
wat ['a'] : Vect 2 Char
*wat> wat ['a','b']
VCons 'a' (VCons 'b' VNil) : Vect 2 Char
*wat> wat ['a','b','c']
wat ['a', 'b', 'c'] : Vect 2 Char

We only see that we get a correct value for wat ['a', 'b']. The rest of the examples we can see that the value is “not computed”, kind of.

But, how does Idris handle this at run-time?

*wat> :exec
Following unsafe greet, you can only enter chars of size 2
What is your name? Hello
*wat> :exec
Following unsafe greet, you can only enter chars of size 2
What is your name? Hi
Hello Hi
*wat> 

We see that Idris stopped the execution of the process. What happens if we use the `node` target for compiling?

boro@bor0:~/idris-test$ idris --codegen node wat.idr -o wat.js
boro@bor0:~/idris-test$ node wat.js 
Following unsafe greet, you can only enter chars of size 2
What is your name? Hello
/Users/boro/idris-test/wat.js:177
    $cg$7 = new $HC_2_1$Prelude__List___58__58_($cg$2.$1, new $HC_2_1$Prelude__List___58__58_($cg$9.$1, $HC_0_0$Prelude__List__Nil));
                                                                                                    ^

TypeError: Cannot read property '$1' of undefined
    at Main__greet (/Users/boro/idris-test/wat.js:177:101)
    at Main__main (/Users/boro/idris-test/wat.js:187:12)
    at $_0_runMain (/Users/boro/idris-test/wat.js:231:25)
    at Object.<anonymous> (/Users/boro/idris-test/wat.js:235:1)
    at Object.<anonymous> (/Users/boro/idris-test/wat.js:236:3)
    at Module._compile (module.js:660:30)
    at Object.Module._extensions..js (module.js:671:10)
    at Module.load (module.js:573:32)
    at tryModuleLoad (module.js:513:12)
    at Function.Module._load (module.js:505:3)
boro@bor0:~/idris-test$ node wat.js 
Following unsafe greet, you can only enter chars of size 2
What is your name? Hi
Hello Hi

Whoops. What about C?

boro@bor0:~/Desktop/idris-test$ idris --codegen C wat.idr -o wat
boro@bor0:~/Desktop/idris-test$ ./wat
Following unsafe greet, you can only enter chars of size 2
What is your name? Hello
Segmentation fault: 11
boro@bor0:~/Desktop/idris-test$ ./wat
Following unsafe greet, you can only enter chars of size 2
What is your name? Hi
Hello Hi

It segfaults on the bad case.

As a conclusion, make sure your non-total functions have checks on the run-time as well.

Alternatively, if we want to take full advantage of the type safety that dependent types may offer, we need to make sure that all of our functions are total 🙂

Type systems and proofs

Today at KIKA Hackerspace we had an interesting event related to proofs and type systems.

We started with Coq syntax for inductive types, definitions and fixpoints. Then proceeded by defining natural numbers in Coq along with arithmetic.

So far so good, we can do the same stuff with TypeScript and most other typed programming languages.

Then we watched Idris: type safe printf which was an interesting introduction to how Idris supports dependent types.

Going back to Coq, we proceeded by defining product type pair and shown the Curry-Howard correspondence (types are theorems, programs are proofs) between swap and commutative AND.

We took a look at tactics and how they can generate lambda abstractions and applications based on some automation.

Entering the dependent types realm, this is where languages like TypeScript are limited in a sense, since you cannot make a type based on some values (note it is different to make a type based on some type).

Dependent types allow for some interesting stuff like proving specific propositions where propositions are types (e.g. forall x : X, x > 2 -> x > 1), but they also allow forall itself to be defined in terms of them. The reason most IO languages don’t support them is because of the undecidability problem.

The take away was that, depending on the abstraction level we are working, types and proofs can give us a sense of safety based on some truths we take for granted (axioms), and this is how we program everyday. We have an initial base we trust and from there we start building our abstractions.

However, this is not always easy to achieve. E.g. consider we have a button that is supposed to download a PDF. In order to prove that, you first pick the abstraction level and then proceed by defining things (what is a PDF, what is a download).

Bonus: Since it seems that Idris supports both IO and dependent types, we discussed some interesting and weird cases such as, what happens if you compile a program which types depend on some SQL database, and then after a while the schema changes. We would still have to handle stuff at runtime if we were to run an old executable.