Programming using logic assistants

What does a mathematician do? They play with theorems.

What does a smart mathematician do? They know the theorems, but don’t necessarily remember all of them off the top of their head. In fact, that’s probably impossible. They use proof assistants or logic programming languages.

I think a similar conclusion can be drawn for programmers.

At my current job, I constantly work with different parts of a complex system. I cannot, and am not expected to, remember all the parts of the system or the decisions behind them, so we often need input from either Architecture or Product Managers. This is where communication comes in. We use JIRA/HipChat and other tools to ease our communication.

But I was thinking, what if we could use logic programming languages as our personal assistants to draw conclusions about decisions of the system’s behaviour?

Consider a complex system S with parts P = {P_1, P_2, ..., P_n}, and let I = {I_1, I_2, ..., I_m} be the set of all possible inputs. Each part might need different inputs, so we can define a function f : P \to \mathcal{P}(I) mapping each part to the subset of inputs it needs. Suppose we’re working on part P_k. Then f(P_k) ⊆ I is the set of inputs needed for that part specifically. Initially, the values of the inputs in f(P_k) are unknown. But at some point in time you get to know their values, and at that point you make a decision.

We can take Prolog as an example. We start with an initial knowledge base, which probably provides some of the values for f(P_k), and as we get more and more input, we keep extending the knowledge base until the set of unknown inputs in f(P_k) is empty, i.e. no further input is required to make a decision.

This is good because you get to re-use the knowledge base in the future, and every time f(P_k) gets an update you can draw conclusions from it without recalling all the details yourself, which saves a lot of time.
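As a minimal illustration of the idea (a Python sketch rather than Prolog; the part and input names are made up):

```python
# Sketch: a part P_k is ready for a decision once all of its required
# inputs f(P_k) have known values. Part/input names are hypothetical.
required = {"P_k": {"max_users", "storage_backend"}}
known = {}  # grows as input arrives from Architecture / Product

def missing(part):
    """The still-unknown inputs of `part`, i.e. the current f(P_k)."""
    return required[part] - known.keys()

known["max_users"] = 10_000
print(missing("P_k"))            # only storage_backend remains
known["storage_backend"] = "S3"
print(missing("P_k") == set())   # f(P_k) is empty: ready to decide
```

Each incoming answer shrinks the unknown set, and the moment it is empty you know no further input is required.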

Most tasks are tiny and “easy” enough to evaluate the logic for in your head, but once you have a lot of tasks you can’t remember every tiny detail about each one of them. This is what computers are for. 🙂

Prolog Proof search trees

Somewhere I read that writing programs is much like writing mathematical proofs.
Let’s try to understand what’s meant by that, by playing with Prolog.

Consider the following Prolog facts, that is, given definitions:

bigger(cat, mouse).
bigger(mouse, bug).

Now, we can ask Prolog: what’s smaller than a cat?

?- bigger(cat, X).
X = mouse.

It of course responds that a mouse is smaller than a cat.
But what about a bug? Why didn’t it report that a bug is smaller than a cat?
This is because Prolog operates under the closed-world assumption: other than what is supplied to it, it assumes everything else is false. So, for instance, if we did:

?- bigger(cat, bug).
false.

We would get a negative answer.

To fix this, we need to define the transitive closure of bigger. What this means is that if we have A·B and B·C, then A·C holds as well, that is, ∀a, b, c ∈ X : (a·b ∧ b·c) ⇒ a·c. Do you see the mathematical similarity here?

In this case, · would be bigger. Let’s call this relation is_bigger.

This is how it looks:

is_bigger(A, B) :- bigger(A, B).
is_bigger(A, B) :- bigger(A, C), is_bigger(C, B).

Doing a couple of tests:

?- is_bigger(cat, mouse).
true .

?- is_bigger(mouse, bug).
true .

?- is_bigger(cat, bug).
true .

It seems to be working fine!

Now let’s try to explain the definition.

The :- operator here is like a mathematical if: if we write a :- b and then ask for the truth value of a, it reports true if b is true as well. So by is_bigger :- bigger, we say that the former is true if the latter is.

When you define multiple clauses for the same predicate in Prolog, it tries to find one of them that can be proven. In this case it acts kind of like an OR: it tries the first clause, then the second; if all of them fail, the result is false, but if even one of them succeeds, the result is true. Note that we could also rewrite the definition like:

is_bigger(A, B) :- bigger(A, B); bigger(A, C), is_bigger(C, B).

That is, rewrite it using the ; operator as an OR. This is also valid, but much less readable.

In contrast, when you do a, b, c. in Prolog (i.e. comma operator), it does an AND, i.e. it first tries to prove a, then b, then c. If one of them fails, the result is false. Otherwise, it’s true.
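Roughly, in Python terms (a loose analogy only, ignoring unification and backtracking), a conjunction of goals behaves like all, and alternative clauses behave like any:

```python
# Loose Python analogy for how Prolog combines goals and clauses.
def conjunction(goals):
    # a, b, c. -- every goal must succeed (AND)
    return all(goal() for goal in goals)

def clauses(alternatives):
    # clause 1; clause 2; ... -- one success is enough (OR)
    return any(alt() for alt in alternatives)

succeed = lambda: True
fail = lambda: False

print(conjunction([succeed, succeed]))  # True
print(conjunction([succeed, fail]))     # False
print(clauses([fail, succeed]))         # True
```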

One could easily notice that this definition is recursive, and that the first clause is the base case of the recursion. This is correct.

So, calling is_bigger(cat, bug) fails for the first clause, and Prolog then proceeds to the second (recursive) one to see if it can find a proof there. It tries to prove bigger(A, C), is_bigger(C, B), that is, it tries to find a C such that these two goals are true. But does such a C exist? Let’s expand further.

is_bigger(C, B) would first attempt to prove bigger(C, B), that is, bigger(C, bug). According to the given facts, C would be mouse. This is a possible match. Let’s return to bigger(A, C) and see if mouse also fits there, i.e. is it true that bigger(cat, mouse)? Of course, this is also among the facts. So for C = mouse, we can prove that bigger(A, C), is_bigger(C, B) is true. From this, it follows that is_bigger(cat, bug) is true.

Would you agree that we’ve just written a mathematical proof? Given a few definitions, using transitivity we were able to deduce some fact, using Prolog.
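The clause-by-clause search we just traced can be mirrored almost line for line in Python, treating the two is_bigger clauses as code (a sketch over the same two facts):

```python
# The same two facts, as a set of (bigger, smaller) pairs.
facts = {("cat", "mouse"), ("mouse", "bug")}

def is_bigger(a, b):
    # Clause 1: bigger(A, B).
    if (a, b) in facts:
        return True
    # Clause 2: bigger(A, C), is_bigger(C, B).
    return any(x == a and is_bigger(c, b) for (x, c) in facts)

print(is_bigger("cat", "bug"))   # True, via C = mouse
print(is_bigger("bug", "cat"))   # False
```

(Like the Prolog version, this assumes the facts contain no cycles; otherwise the recursion would not terminate.)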

The gap between Prolog and mathematics is much smaller than, say, the gap between Python and mathematics. This is why notation is important. It makes you think in a different way about tasks you have likely already solved using other languages.

So, let’s turn to our everyday languages for a while. What we have to deal with is some trees and some equality checking, and we should be able to implement this easily in another language. Let’s try Python.

We’re going to write a program that attempts to prove a statement using a given search tree, just like Prolog did. With this, I think we will be able to see how writing programs is actually much like writing mathematical proofs.

I won’t go through the details, but here’s how it could look:


class node(object):
    def __init__(self, value, children=None):
        # Avoid a mutable default argument: a shared [] would be reused
        # across all nodes created without children.
        self.value = value
        self.children = children if children is not None else []

    def contains(self, value):
        for x in self.children:
            if x.value == value:
                return True
        return False

    def __repr__(self, level=0):
        ret = " " * level + repr(self.value) + "\n"
        for child in self.children:
            ret += child.__repr__(level + 1)
        return ret

# The relation tree to work with
tree = node("bigger", [
    node("cat", [node("mouse"), node("bug")]),
    node("mouse", [node("bug")]),
])

# This function attempts to find a proof for the given task
# by walking the tree
def find_proof(tree, A, B):
    if tree.value == A and tree.contains(B):
        return True

    for child in tree.children:
        if find_proof(child, A, B):
            return True

    return False

# Try to find proofs
print(find_proof(tree, 'cat', 'mouse'))
print(find_proof(tree, 'mouse', 'bug'))
print(find_proof(tree, 'cat', 'bug'))

It returns:


True
True
True

What would be more interesting is to write a function that would generate the proof tree instead of us manually supplying it 🙂
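As a sketch of that direction (assuming the same two base facts as above), one could compute each node’s children as the transitive closure of the facts instead of writing the tree by hand:

```python
# Build the relation tree from base facts instead of supplying it manually.
facts = [("cat", "mouse"), ("mouse", "bug")]

def smaller_than(a):
    """Everything provably smaller than `a`, via transitivity."""
    result = set()
    for x, y in facts:
        if x == a:
            result.add(y)
            result |= smaller_than(y)
    return result

# Map each animal to all animals it is (transitively) bigger than.
closure = {x: sorted(smaller_than(x)) for x, _ in facts}
print(closure)  # {'cat': ['bug', 'mouse'], 'mouse': ['bug']}
```

Feeding this closure into node constructors would reproduce the hand-written tree.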

Playing with the type system

Given these definitions:
(.) :: (b -> c) -> (a -> b) -> a -> c
map :: (a -> b) -> [a] -> [b]
head :: [a] -> a

Find the type of (map . head)

1. (f . g) x = f (g x)
(map . head) x = map (head x)

2. If we set
x :: [a -> b]
then we have
head x :: a -> b

3. \x -> map (head x) :: [a -> b] -> [a] -> [b]
=>
map . head :: [a -> b] -> [a] -> [b]
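We can sanity-check the derived type concretely, e.g. in Python with curried analogues of map, head and (.) (a rough analogy only, since Python is untyped; all names below are made up):

```python
# Curried analogues: compose(f, g)(x) == f(g(x)).
compose = lambda f, g: lambda x: f(g(x))
head = lambda xs: xs[0]
map_ = lambda f: lambda xs: [f(x) for x in xs]  # curried map

# map . head : takes a list of functions, returns a list transformer,
# matching the derived type [a -> b] -> [a] -> [b].
map_head = compose(map_, head)

fs = [lambda n: n * 10, lambda n: n + 1]   # plays the role of [a -> b]
print(map_head(fs)([1, 2, 3]))             # applies head fs to each element
```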

Let’s now try to find a sensible definition for the type [a -> b] -> [a] -> [b] by using a hole-driven approach (inspired by http://www.youtube.com/watch?v=52VsgyexS8Q):

In this example we are using a silent hole (i.e. undefined). We also write out the types of the arguments (xs and ys):

{-# LANGUAGE ScopedTypeVariables #-}

f :: forall a b. [a -> b] -> [a] -> [b]
f xs ys = undefined
    where
    _ = xs :: [a -> b]
    _ = ys :: [a]

It compiles, but it does nothing.

Let’s try to do the same with a noisy hole:

data Hole = Hole

f :: forall a b. [a -> b] -> [a] -> [b]
f xs ys = Hole
    where
    _ = xs :: [a -> b]
    _ = ys :: [a]

We get this error:
Couldn't match expected type `[b]' with actual type `Hole'

How do we get from Hole to [b]? We have xs :: [a -> b]. What if instead we tried map?

f :: forall a b. [a -> b] -> [a] -> [b]
f xs ys = map Hole Hole
    where
    _ = xs :: [a -> b]
    _ = ys :: [a]

We now get these errors:
Couldn't match expected type `a0 -> b' with actual type `Hole'
Couldn't match expected type `[a0]' with actual type `Hole'

The second hole is obviously just ys:

f :: forall a b. [a -> b] -> [a] -> [b]
f xs ys = map Hole ys
    where
    _ = xs :: [a -> b]
    _ = ys :: [a]

Now we get:
Couldn't match expected type `a -> b' with actual type `Hole'

What if we try to use just xs instead of Hole?

Couldn't match expected type `a -> b' with actual type `[a -> b]'

Now it wants a single function, not the list xs.

One way to get one is to use (head xs):

f :: forall a b. [a -> b] -> [a] -> [b]
f xs ys = map (head xs) ys
    where
    _ = xs :: [a -> b]
    _ = ys :: [a]

This makes it happy.

Conclusion:
The function of type [a -> b] -> [a] -> [b] is not unique: we could pass something else to map instead of (head xs), for example (last xs).
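To see the non-uniqueness concretely (a Python sketch; a typed language would make the shared shape explicit):

```python
# Two different functions with the same shape [a -> b] -> [a] -> [b].
def f_head(fs, xs):
    return [fs[0](x) for x in xs]    # like map (head fs) xs

def f_last(fs, xs):
    return [fs[-1](x) for x in xs]   # like map (last fs) xs

fs = [lambda n: n + 1, lambda n: n * 2]
print(f_head(fs, [1, 2, 3]))  # [2, 3, 4]
print(f_last(fs, [1, 2, 3]))  # [2, 4, 6]
```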

Bonus:
id' :: forall a. a -> a
id' x = Hole
    where
    _ = x :: a

Couldn't match expected type `a' with actual type `Hole'

id' x = x

Folds in imperative languages

Today I was again playing with folds, but this time implementing them in PHP.
So, consider the list [a, b, c]:
foldr f z [a,b,c] = a `f` (b `f` (c `f` z))
foldl f z [a,b,c] = ((z `f` a) `f` b) `f` c

Note the shapes of foldr and foldl. They look pretty similar, except that the nesting is mirrored and z sits at the opposite end (e.g. 1 `f` (2 `f` (3 `f` 0)) against ((0 `f` 1) `f` 2) `f` 3).
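These two shapes can be sketched in Python: foldl is essentially functools.reduce, while foldr walks the list from the right:

```python
from functools import reduce

def foldl(f, z, xs):
    # ((z `f` a) `f` b) `f` c
    return reduce(f, xs, z)

def foldr(f, z, xs):
    # a `f` (b `f` (c `f` z))
    acc = z
    for x in reversed(xs):
        acc = f(x, acc)
    return acc

sub = lambda x, y: x - y
print(foldl(sub, 0, [1, 2, 3]))  # ((0-1)-2)-3 = -6
print(foldr(sub, 0, [1, 2, 3]))  # 1-(2-(3-0)) = 2
```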

Now consider we use array_reduce in PHP like this:


<?php
var_dump(array_reduce(array(1,2,3), function($x, $y) { return $x - $y; }, 0));

This will print “-6” as a result. And in Haskell:


Prelude> foldl (-) 0 [1,2,3]
-6

So it turns out we’re using a left fold. What can we do to make it a right fold?
What if we try something like


<?php
var_dump(array_reduce(array(1,2,3), function($x, $y) { return $y - $x; }, 0));

This one will print “2” as a result. Haskell says:


Prelude> foldr (-) 0 [1,2,3]
2

And in Haskell we can also do something like


Prelude> let f x y = y - x
Prelude> foldr f 0 [1,2,3]
-6

So, it’s interesting to play around with associativity of binary operators 🙂
But we don’t want to do it in a “hacky” way by flipping the operands inside the function. So we want a proper foldr.
Here’s the full PHP code with foldr and foldl implemented:


<?php
interface iBinaryOperator {
    public function calc($x, $y);
}

function foldr(iBinaryOperator $f, $z, array $xs) {
    try {
        $x = array_shift($xs);
        if ($x === null) {
            return $z;
        }
        return $f->calc($x, foldr($f, $z, $xs));
    } catch (Exception $e) {
        return "error";
    }
}

// Foldl is tail recursive
function foldl(iBinaryOperator $f, $z, array $xs) {
    try {
        $x = array_shift($xs);
        if ($x === null) {
            return $z;
        }
        return foldl($f, $f->calc($z, $x), $xs);
    } catch (Exception $e) {
        return "error";
    }
}

class Subtraction implements iBinaryOperator {
    public function calc($x, $y) {
        if (gettype($x) != "integer" || gettype($y) != "integer") {
            throw new Exception("MathException");
        }
        return $x - $y;
    }
    /*
    public function partial_calc($x) {
        if (gettype($x) != "integer") {
            throw new Exception("MathException");
        }
        return function ($y) use ($x) {
            if (gettype($y) != "integer") {
                throw new Exception("MathException");
            }
            return $x - $y;
        };
    }
    */
}

$x = new Subtraction();
var_dump(foldr($x, 0, array(1,2,3)));
var_dump(foldl($x, 0, array(1,2,3)));

/*
> php test.php
int(2)
int(-6)
*/

Simple IO stuff

Today we’ll play around with the following task:

Write a program that reads a number N as input, then performs N additional reads and prints them out.

So first, let’s see how we can do this with a static number of times, say 3:

*Main> (readLn :: IO Int) >>= \x -> return $ x : []
1
[1]
*Main> (readLn :: IO Int) >>= \x -> (readLn :: IO Int) >>= \y -> return $ x : y : []
1
2
[1,2]
*Main> (readLn :: IO Int) >>= \x -> (readLn :: IO Int) >>= \y -> (readLn :: IO Int) >>=
\z -> return $ x : y : z : []
1
2
3
[1,2,3]
*Main>

So what we need is a function that takes a number as input and gives us the reads as output.
There are two ways to do this, and we’ll do both and then discuss the difference between them.

pn :: Int -> [IO Int]
pn' :: Int -> IO [Int]

What pn does is return a list of IO actions to be executed (which we can execute with sequence, for example).
pn' on the other hand returns a single IO action that produces a list of numbers.
So we can view pn' as one IO action that returns a list of numbers, and pn as a list of IO actions that each return a number. Pretty simple!

Let’s start with pn. To cover the base case, we say that pn 0 returns an empty list of actions. Note that [] is [IO Int] here, not [Int].

pn 0 = []

For the inductive step, we put the *read a number using readLn* action at the head of the list (without executing it), prepending it to the result of the recursive call. Note that x is IO Int here, not Int.

pn n = do
    -- No need for an explicit signature here because Haskell already knows it from pn's type signature
    let x = readLn :: IO Int
    x : pn (n - 1)

Now, to execute this, we can do:

*Main> sequence $ pn 3
1
2
3
[1,2,3]
*Main>

And what is sequence?

*Main> :t sequence
sequence :: Monad m => [m a] -> m [a]

A simple definition of it would look like:

sequence [] = return []
sequence (x:xs) = do v <- x; vs <- sequence xs; return (v:vs)

So, sequence takes a list of monadic actions (IO actions in this case), executes each one of them and returns their results *combined*.
pn' has the same implementation idea as pn, but with the sequencing done _within_ it, so that we don’t have to call sequence every time.

pn' :: Int -> IO [Int]
pn' 0 = return []
pn' n = do
    x <- readLn
    xs <- pn' (n - 1)
    return $ x : xs

For the base case, we need a monadic empty list, that is:

*Main> :t return
return :: Monad m => a -> m a
*Main> :t (return []) :: IO [Int]
(return []) :: IO [Int] :: IO [Int]

For the inductive step, we read a number into x, i.e. we execute the read action with <- (in contrast to =). Note that x is Int here, not IO Int.
Then, we recursively call pn' to read more numbers, and store those results in xs. Note that xs is [Int] here.
After that, we return x:xs in order to get an IO [Int] instead of [Int].
We can now call it like:

*Main> pn' 3
1
2
3
[1,2,3]
*Main>

Let’s try to unwrap this call:

test :: IO [Int]
test = do
    x' <- readLn
    xs' <- do
        x'' <- readLn
        xs'' <- do
            x''' <- readLn
            xs''' <- return []
            return $ x''' : xs'''
        return $ x'' : xs''
    return $ x' : xs'

And that’s how it works. Or, we can rewrite everything as:


Prelude> let pn'' n = sequence $ take n $ repeat (readLn :: IO Int)
Prelude> pn'' 5
1
2
3
4
5
[1,2,3,4,5]
Prelude>

But it’s worth knowing what it does behind the scenes 🙂
We can now extend this function to accept an additional parameter (a function) that will be applied to each value read:

pn''' :: Int -> (Int -> Int) -> IO [Int]
pn''' 0 _ = return []
pn''' n f = do
    x <- readLn
    xs <- pn''' (n - 1) f
    return $ (f x) : xs

And call it like:

*Main> pn''' 5 succ
1
2
3
4
5
[2,3,4,5,6]
*Main>