Idris, dependent types and IO

The purpose of this post is to demonstrate the interaction between dependent types and IO in Idris.

In the examples, for a given IO string, we will pass it through a total and a non-total function to demonstrate how Idris handles both cases.

We will define a dependent type called Vect which will allow us to work with lists, but additionally it will contain the length of the list at type level as follows:

data Vect : Nat -> Type -> Type where
   VNil  : Vect Z a
   VCons : a -> Vect k a -> Vect (S k) a

So, in the following demonstration we will see how both the safe and unsafe code pass compile-time, but the latter may fail on run-time.

Safe:

total
wat_total : List Char -> Vect 2 Char
wat_total (x :: y :: []) = VCons x (VCons y VNil)
wat_total _ = VCons 'a' (VCons 'b' VNil)

unwat : Vect 2 Char -> List Char
unwat (VCons a (VCons b VNil)) = a :: b :: []

wrapunwrap_safe : String -> String
wrapunwrap_safe name = pack (unwat (wat_total (unpack name)))

greet_safe : IO ()
greet_safe = do
    putStr "What is your name? "
    name <- getLine
    putStrLn ("Hello " ++ (wrapunwrap_safe name))

main : IO ()
main = do
    putStrLn "Following safe greet, enter any number of chars"
    greet_safe

If we specify total to the wat_total function, it must produce a value for all possible inputs, and this is okay and safe. All the rest of the checks are done on compile time and the compiler will force you to be sure that any callers to wat_total have the types flow correctly.

Unsafe:

wat : List Char -> Vect 2 Char
wat (x :: y :: []) = VCons x (VCons y VNil)

unwat : Vect 2 Char -> List Char
unwat (VCons a (VCons b VNil)) = a :: b :: []

wrapunwrap : String -> String
wrapunwrap name = pack (unwat (wat (unpack name)))

greet : IO ()
greet = do
    putStr "What is your name? "
    name <- getLine
    putStrLn ("Hello " ++ (wrapunwrap name))

main : IO ()
main = do
    putStrLn "Following unsafe greet, you can only enter chars of size 2"
    greet

The tricky case is when we’re dealing with non-total functions. Let’s check a few return values:

*wat> wat []
wat [] : Vect 2 Char
*wat> wat ['a']
wat ['a'] : Vect 2 Char
*wat> wat ['a','b']
VCons 'a' (VCons 'b' VNil) : Vect 2 Char
*wat> wat ['a','b','c']
wat ['a', 'b', 'c'] : Vect 2 Char

We only see that we get a correct value for wat ['a', 'b']. The rest of the examples we can see that the value is “not computed”, kind of.

But, how does Idris handle this at run-time?

*wat> :exec
Following unsafe greet, you can only enter chars of size 2
What is your name? Hello
*wat> :exec
Following unsafe greet, you can only enter chars of size 2
What is your name? Hi
Hello Hi
*wat> 

We see that Idris stopped the execution of the process. What happens if we use the `node` target for compiling?

boro@bor0:~/idris-test$ idris --codegen node wat.idr -o wat.js
boro@bor0:~/idris-test$ node wat.js 
Following unsafe greet, you can only enter chars of size 2
What is your name? Hello
/Users/boro/idris-test/wat.js:177
    $cg$7 = new $HC_2_1$Prelude__List___58__58_($cg$2.$1, new $HC_2_1$Prelude__List___58__58_($cg$9.$1, $HC_0_0$Prelude__List__Nil));
                                                                                                    ^

TypeError: Cannot read property '$1' of undefined
    at Main__greet (/Users/boro/idris-test/wat.js:177:101)
    at Main__main (/Users/boro/idris-test/wat.js:187:12)
    at $_0_runMain (/Users/boro/idris-test/wat.js:231:25)
    at Object.<anonymous> (/Users/boro/idris-test/wat.js:235:1)
    at Object.<anonymous> (/Users/boro/idris-test/wat.js:236:3)
    at Module._compile (module.js:660:30)
    at Object.Module._extensions..js (module.js:671:10)
    at Module.load (module.js:573:32)
    at tryModuleLoad (module.js:513:12)
    at Function.Module._load (module.js:505:3)
boro@bor0:~/idris-test$ node wat.js 
Following unsafe greet, you can only enter chars of size 2
What is your name? Hi
Hello Hi

Whoops. What about C?

boro@bor0:~/Desktop/idris-test$ idris --codegen C wat.idr -o wat
boro@bor0:~/Desktop/idris-test$ ./wat
Following unsafe greet, you can only enter chars of size 2
What is your name? Hello
Segmentation fault: 11
boro@bor0:~/Desktop/idris-test$ ./wat
Following unsafe greet, you can only enter chars of size 2
What is your name? Hi
Hello Hi

It segfaults on the bad case.

As a conclusion, make sure your non-total functions have checks on the run-time as well.

Alternatively, if we want to take full advantage of the type safety that dependent types may offer, we need to make sure that all of our functions are total ๐Ÿ™‚

Type systems and proofs

Today at KIKA Hackerspace we had an interesting event related to proofs and type systems.

We started with Coq syntax for inductive types, definitions and fixpoints. Then proceeded by defining natural numbers in Coq along with arithmetic.

So far so good, we can do the same stuff with TypeScript and most other typed programming languages.

Then we watched Idris: type safe printf which was an interesting introduction to how Idris supports dependent types.

Going back to Coq, we proceeded by defining product type pair and shown the Curry-Howard correspondence (types are theorems, programs are proofs) between swap and commutative AND.

We took a look at tactics and how they can generate lambda abstractions and applications based on some automation.

Entering the dependent types realm, this is where languages like TypeScript are limited in a sense, since you cannot make a type based on some values (note it is different to make a type based on some type).

Dependent types allow for some interesting stuff like proving specific propositions where propositions are types (e.g. forall x : X, x > 2 -> x > 1), but they also allow forall itself to be defined in terms of them. The reason most IO languages don’t support them is because of the undecidability problem.

The take away was that, depending on the abstraction level we are working, types and proofs can give us a sense of safety based on some truths we take for granted (axioms), and this is how we program everyday. We have an initial base we trust and from there we start building our abstractions.

However, this is not always easy to achieve. E.g. consider we have a button that is supposed to download a PDF. In order to prove that, you first pick the abstraction level and then proceed by defining things (what is a PDF, what is a download).

Bonus: Since it seems that Idris supports both IO and dependent types, we discussed some interesting and weird cases such as, what happens if you compile a program which types depend on some SQL database, and then after a while the schema changes. We would still have to handle stuff at runtime if we were to run an old executable.

Top 3 applications I mostly use

Here’s a list of items I spend most time of, daily (both work and non-work related):

  1. Terminal – This should not come as a surprise. vim is my main tool for editing code, and I spend a lot of time on it. I also spend a lot of time using grep when searching for something, or running PHP/Coq/etc from command line to quick test some code. git is also there.
  2. Chrome – After I write the code, I need to test it. Using the browser usually involves testing, but also stuff like social media, this blog, or reading something else interesting.
  3. Notes – I don’t think about this application that much, but it’s a fact that I spend a lot of time using it (both mobile and desktop). You can write just anything in Notes, but I classify mine mostly to be:
    1. TODOs
    2. Schedules, appointments
    3. Temporary information (what work I’ve done for the week, random thoughts, ideas (e.g. this post))
    4. Personal information (card id, pin codes, bike codes)
    5. Short Math proofs

There’s my list. What applications do you use on a daily basis? Feel free to put some in the comments section.

Associativity of Elvis operator

I had written a piece of code in the format a ?: (b ?: c) and I was wondering if I could omit the brackets. That’s right, we need to figure if this operator is associative.

That is, we need to see if we can show that (a ?: b) ?: c = a ?: (b ?: c).

The definition of it is:
a ?: b = a (if a is not empty)
a ?: b = b (otherwise)

So that means we can consider these cases:

  1. Suppose a is empty.
    Now we are left with b ?: c = b ?: c, which is true by reflexivity.
  2. Suppose a is not empty.
    Then, (a ?: b) evaluates to a, so we have a ?: c on the LHS, and thus a ?: c = a ?: (b ?: c).

Since a is not empty, and by definition of this operator, we have that a = a, which is true by reflexivity.

Bonus: here’s the same proof in Coq:

(* Definition of data type for empty and value one *)
Inductive value {a : Type} : Type :=
  | Value : a -> value
  | Empty : value.

(* Definition of Elvis operator *)
Definition elvis {x : Type} (x1 : @value x) (x2 : @value x) : (@value x) :=
  match x1 with
  | Empty => x2
  | Value _ => x1
  end.

(* Fancy notation for Elvis operator *)
Notation "x ?: y" := (elvis x y) (at level 50).

Theorem elvis_assoc : forall (X : Type) (x1 : @value X) (x2 : @value X) (x3 : @value X),
  x1 ?: (x2 ?: x3) = (x1 ?: x2) ?: x3.
Proof.
  intros X x1 x2 x3.
  case x1.
  - (* x1 is not empty *) simpl. reflexivity.
  - (* x1 is empty     *) simpl. reflexivity.
Qed.

Now that both considered cases are true, we’ve shown that ?: is associative and you can skip any brackets in your code ๐Ÿ™‚

Metamath

During my research on various theorem provers, today I stumbled upon Metamath.

To me, what was interesting about it was its simplicity. You start by defining a formal system, i.e. variables, symbols, axioms, inference rules. Then you build new theorems based on the formal system.

The core concept behind Metamath is substitution, and it’s using RPN notation to build hypothesis on the stack and then rewrite them using the inference rules to reach conclusion.

The program is written in C, and I compiled it on both OS X and my Android phone. It’s pretty light-weight and compiles in a few milliseconds, and it’s also interesting that you can take it anywhere you want on your phone.

Metamath has no special syntax. It will tokenize a file that we give to it, and tokens that start with $ are Metamath tokens, while everything else are user-defined tokens.

Here is a list of built-in Metamath tokens:

  • To define constants, use the $c token.
  • To define variables, use the $v token.
  • To define types of variables, use the $f token.
  • To define essential hypothesis, use the $e token.
  • To define axioms, use the $a token.
  • To define proofs, use the $p token.
  • To start proving in $p statement, use the $= token.
  • To end the statements above, use the $. token.
  • To insert comments, use the $( and $) tokens.
  • To define a block (has effect on scoping), use the ${ and $} tokens. Note that only $a and $p tokens will remain outside the scope.

That’s basically it. The package has included demo example and an example for the MU puzzle as well. There are other systems as well (Peano, Set).

There are some basic rules as well:

  • A hypothesis is either a $e or a $f statement
  • For every variable in $e, $a or $p, we must have a $p assertion, i.e. every variable in an essential hypothesis/axiom/proof must have a floating hypothesis (type)
  • A $f, $e, or $d statement is active from the place it occurs until the end of the block it occurs in. A $a or $p statement is active from the place it occurs through the end of the database.

For the complete syntax you can refer to the Metamath book.

Now for the example we’ll use, start by creating a test.mm file. We’ll define a formal system and demonstrate the usage of modus ponens to come up with a new theorem, based on our initial axioms.

$( Declare the constant symbols we will use $)
$c -> ( ) wff |- I J $.

$( Declare the variables we will use $)
$v p q $.

$( Specify properties of the variables, i.e. they are wff formulas $)
wp $f wff p $.
wq $f wff q $.

$( Define "mp", for the modus ponens inference rule $)
${
mp1 $e |- p $.
mp2 $e |- ( p -> q ) $.
mp  $a |- q $.
$}

$( Define our initial axioms. I and J are well-formed formulas,
we have a proof for I and we have conditional for I -> J $)
wI  $a wff I $.
wJ  $a wff J $.
wim $a wff ( p -> q ) $.

$( Prove that we can deduce J from the initial axioms. Note how we use
block scope here, since we don't want the hypothesis proof_I and
proof_I_imp_J to be visible outside of this scope. $)
${
$( Given I and I -> J $)
proof_I $e |- I $.ยง
proof_I_imp_J $e |- ( I -> J ) $.

$( Prove that we can deduce J $)
proof_J $p |- J $=
wI  $( Stack: [ 'wff I' ] $)
wJ  $( Stack: [ 'wff I', 'wff J' ] $)

$( Note that we had to specify wff for I and J before using mp,
since the types have to match, as set on line 8 and 9 $)

proof_I       $( Stack: [ 'wff I', 'wff J', '|- I' ] $)
proof_I_imp_J $( Stack: [ 'wff I', 'wff J', '|- I', '|- ( I -> J )' ] $)

mp  $( Stack: [ '|- J' ] $)
$.
$}

To verify our file, we run ./metamath 'read test.mm' 'verify proof *' exit.

Note how we had to separate wff from |-. Otherwise, if we just used |-, then all well formed formulas would be proven to be true, which doesn’t make much sense.

In addition to this, the reason why we have implication -> and turnstile provable assertion |- is that the former is a wff (i.e. statement in the object language) and works on propositions, while the latter is not a wff but rather a statement in the meta language and works on proofs.

The arrow -> is usually used to denote an “internalization” of the meaning of turnstile, into the object language. This allows us, in some sense (depending on the meaning ascribed to ->) to represent the relationships between hypotheses and conclusions in the object language, whereas without it it becomes difficult to reason about on a higher-order level (e.g. most systems do not allow A, (A |- B) |- B).

For more complete examples, you can check out:

  • logic.mm – where I define a basic logical system that contains definitions for and-elim and and-intro and proves some theorems.
  • peano.mm – where I define successor for natural numbers and prove some theorems about ordering of them.

Due to its minimalistic design, when compared to Coq it has no Calculus of Constructions, Inductive Types, etc.

To conclude, I think it’s a fun program to play with, but since it has no “real” framework, I don’t think it’s as industry ready as Coq.