Svpion Fifth problem

I attempted myself to solve #5 from Five programming problems every Software Engineer should be able to solve in less than 1 hour in Haskell.

So the task is:
Write a program that outputs all possibilities to put + or – or nothing between the numbers 1, 2, …, 9 (in this order) such that the result is always 100. For example: 1 + 2 + 34 – 5 + 67 – 8 + 9 = 100.

Initially I thought,
“Hey, this should be easy. All we need to do is create an algebra that contains Addition, Subtraction, and Multiplication (being concatenation) and we should be good!”

So I proceeded:

data Expr =
    A Expr Expr |
    S Expr Expr |
    C Expr Expr |
    I Int deriving Show

parse :: Expr -> Int
parse (A a b) = parse a + parse b
parse (S a b) = parse a - parse b
parse (C a b) = parse a * 10 + parse b
parse (I a) = a

This should be it! So for example 1 + 2 + 3 would be A (I 1) (A (I 2) (I 3)), and getting the result of it is to call parse like:

*Main> parse $ A (I 1) (A (I 2) (I 3))
6

Now all we need to do is create all possibilities of the form (1 _ 2 _ 3 _ 4 _ 5 _ 6 _ 7 _ 8 _ 9) and replace _ with any of the three operators and then finally do something like filter (== 100) parse x.

But using this grammar, there are some non-valid cases such as C (A (I 1) (I 2)) (I 3).
These are cases that we must exclude, for if we don’t, then what this expression would evaluate to is 33, i.e. (1 + 2) * 10 + 3, but this is not a valid expression given the stated task.
However 1 + 23 or 12 + 3 or 1 + 2 + 3 are.

To take care of this, we will slightly rework our grammar and parsing functions:

data Expr =
    A Expr Expr |
    S Expr Expr |
    CE CExpr deriving Show

data CExpr =
    C CExpr CExpr |
    I Int deriving Show

parse :: Expr -> Int
parse (A a b) = parse a + parse b
parse (S a b) = parse a - parse b
parse (CE c) = parseCE c

parseCE :: CExpr -> Int
parseCE (C a b) = parseCE a * 10 + parseCE b
parseCE (I a) = a

Another constraint that we need to add is that the concatenations need to be successive. So we somehow need to exclude those cases as well from all of the possibilities. Let’s call this function getCExprs. So what getCExprs should do is, given a list, it should return possible successive concatenations of that list. Successive concatenations is what will allow us to remove the non-valid cases.

E.g. getCExprs [I 1,I 2,I 3] = [I 1,C (I 1) (I 2),C (C (I 1) (I 2)) (I 3)].

Additionally (we’ll see why within the foldingFunction), we want getCExprs to return the remaining part of the list (digits) it’s working on, so:

getCExprs [I 1,I 2,I 3] = [([I 2,I 3],I 1),([I 3],C (I 1) (I 2)),([],C (C (I 1) (I 2)) (I 3))].

To implement this, we’ll need a help function called listToC that given a list [I 1,I 2,I 3], it will turn it into its concatenated algebraic version, C (C (I 1) (I 2)) (I 3).

The definition for this is trivial:

listToC :: [CExpr] -> CExpr
listToC (x:xs) = foldl C x xs
listToC _      = I 0

Now we are ready to go for getCExprs:

getCExprs :: [CExpr] -> [([CExpr], CExpr)]
getCExprs x = go 1 (tail $ inits x)
    where
    go n xs@(x':xs') = (drop n x, listToC x') : go (n + 1) xs'
    go _ []          = []

Some simple tests:

*Main> getCExprs $ map I [1,2,3]
[([I 2,I 3],I 1),([I 3],C (I 1) (I 2)),([],C (C (I 1) (I 2)) (I 3))]
*Main> getCExprs $ map I [1..9]
[([I 2,I 3,I 4,I 5,I 6,I 7,I 8,I 9],I 1),([I 3,I 4,I 5,I 6,I 7,I 8,I 9],C (I 1) (I 2)),([I 4,I 5,I 6,I 7,I 8,I 9],C (C (I 1) (I 2)) (I 3)),([I 5,I 6,I 7,I 8,I 9],C (C (C (I 1) (I 2)) (I 3)) (I 4)),([I 6,I 7,I 8,I 9],C (C (C (C (I 1) (I 2)) (I 3)) (I 4)) (I 5)),([I 7,I 8,I 9],C (C (C (C (C (I 1) (I 2)) (I 3)) (I 4)) (I 5)) (I 6)),([I 8,I 9],C (C (C (C (C (C (I 1) (I 2)) (I 3)) (I 4)) (I 5)) (I 6)) (I 7)),([I 9],C (C (C (C (C (C (C (I 1) (I 2)) (I 3)) (I 4)) (I 5)) (I 6)) (I 7)) (I 8)),([],C (C (C (C (C (C (C (C (I 1) (I 2)) (I 3)) (I 4)) (I 5)) (I 6)) (I 7)) (I 8)) (I 9))]
*Main>

Seems to be working fine!

Next, we want to create a function f that has three parameters:
1. Current expression calculated with add/sub so far, Expr
2. Current operation being done, String
3. Remaining list of expressions (digits and successive concatenated digits) to work on, [CExpr]
and this function should return a list of (Expr, String), i.e. which expression is produced for what operations.

So we have:

f :: Expr -> [String] -> [CExpr] -> [(Expr, [String])]

This should be a fold, so what f should do is basically go through all valid possibilities (which are created by getCExprs), so, what we have so far:

f s ops [] = [(s, ops)]
f s ops xs = foldr foldingFunction [] $ getCExprs xs

In the first definition of f, we pattern match against an empty list, which is basically the base case and it returns the last pair of (sum, operations) done at that point.

So, this is what we currently have:

import Data.List (inits, nub)

data Expr =
    A Expr Expr |
    S Expr Expr |
    CE CExpr deriving Show

data CExpr =
    C CExpr CExpr |
    I Int deriving Show

parse :: Expr -> Int
parse (A a b) = parse a + parse b
parse (S a b) = parse a - parse b
parse (CE c) = parseCE c

parseCE :: CExpr -> Int
parseCE (C a b) = parseCE a * 10 + parseCE b
parseCE (I a) = a

listToC :: [CExpr] -> CExpr
listToC (x:xs) = foldl C x xs
listToC _      = I 0

getCExprs :: [CExpr] -> [([CExpr], CExpr)]
getCExprs x = go 1 (tail $ inits x)
    where
    go n xs@(x':xs') = (drop n x, listToC x') : go (n + 1) xs'
    go _ []          = []

f :: Expr -> [String] -> [CExpr] -> [(Expr, [String])]
f s ops [] = [(s, ops)]
f s ops xs = foldr foldingFunction [] $ getCExprs xs
    where
    foldingFunction = undefined

So, all we need to do is implement foldingFunction and we are done.

To be able to implement foldingFunction, we need to look at getCExprs and see what it produces for us. We know that it gives us back a pair, ([CExpr], CExpr). CExpr is the current concatenations done, and [CExpr] is the remaining part of the list.

Therefore,
foldingFunction (a, b) l = undefined

Remember f had three params. The way we defined f makes it easily callable by the foldingFunction.

We need to call f from within the foldingFunction and add the current value we are iterating to the sum. We also need to note which operation we are applying, and to pass the current list of digits we are working on. Note that we have the variable s (expression calculated so far) in the scope since we will define foldingFunction within f itself. We also have the variable b produced by getCExprs, but its type is CExpr. if b :: CExpr, then CE b will be Expr, which is what our function f requires, i.e. to parse a complete expression (Expr), and not just digits or concatenated digits (CExpr).

So:
foldingFunction (a, b) l = f (A s (CE b)) (“+” : ops) a

In this case, we are calling f while adding s and b, i.e., we add b to the current expression so far, and then we pass the remaining list of numbers (a) to f.

This takes care of the addition. To make messages more verbose, we’ll implement a function called “calculated” which in details will explain what’s going on:

foldingFunction (a, b) l = f (A s (CE b)) (calculated “+” b) (drop a xs)
calculated op b = (op ++ show (parseCE b)) : ops

Similarly, we need to do the same for the operation minus. And then we need to append all of the results in a single list:
foldingFunction (a, b) l =
f (A s (CE b)) (calculated “+” b) a
++
f (S s (CE b)) (calculated “-” b) a
++ l

So the full code is:

import Data.List (nub)

data Expr =
    A Expr Expr |
    S Expr Expr |
    CE CExpr deriving Show

data CExpr =
    C CExpr CExpr |
    I Int deriving Show

parse :: Expr -> Int
parse (A a b) = parse a + parse b
parse (S a b) = parse a - parse b
parse (CE c) = parseCE c

parseCE :: CExpr -> Int
parseCE (C a b) = parseCE a * 10 + parseCE b
parseCE (I a) = a

listToC :: [CExpr] -> CExpr
listToC (x:xs) = foldl C x xs
listToC _      = I 0

getCExprs xs = map (\x -> (drop x xs, listToC (take x xs))) [1..length xs]

f :: Expr -> [String] -> [CExpr] -> [(Expr, [String])]
f s ops [] = [(s, ops)]
f s ops xs = foldr foldingFunction [] $ getCExprs xs
    where
    foldingFunction (a, b) l = f (A s (CE b)) (calculated "+" b) a
        ++ f (S s (CE b)) (calculated "-" b) a
        ++ l
    calculated op b          = (op ++ show (parseCE b)) : ops

main = do
    let l = f (CE (I 0)) [] (map I [1..9])
    let l' = filter (\x -> parse (fst x) == 100) l
    mapM_ (\(x, y) -> print $ concat $ reverse y) l'

Call main to get:

[1 of 1] Compiling Main             ( test.hs, interpreted )
iOk, modules loaded: Main.
*Main> main
"+1+2+3-4+5+6+78+9"
"+1+2+34-5+67-8+9"
"+1+23-4+5+6+78-9"
"+1+23-4+56+7+8+9"
"-1+2-3+4+5+6+78+9"
"+12+3+4+5-6-7+89"
"+12+3-4+5+67+8+9"
"+12-3-4+5-6+7+89"
"+123+4-5+67-89"
"+123-4-5-6-7+8-9"
"+123+45-67+8-9"
"+123-45-67+89"
*Main>

Solving this problem in a functional language like Haskell reveals its background when it’s represented using an algebraic data type. For instance, if this problem was solved in Python, you would run through all combinations of “1_2_3_4_5_6_7_8_9”, change underscores with plus, minus, or append, and then eval the string expression. But if you solve it this way, you wouldn’t have any deeper insight regarding its algebra, e.g. the “successive concatenations” part might not be immediately visible.

Algebraic data types representation helps us with adding constraints, but to some point. Additional more complex constraints were handled by the functions themselves.

This is why I do not believe this is solvable in under 1 hour when you first meet this problem. There is much more background in this than what the author states.

Playing with the type system

Given these definitions:
(.) :: (b -> c) -> (a -> b) -> a -> c
map :: (a -> b) -> [a] -> [b]
head :: [a] -> a

Find the type of (map . head)

1. (f . g) x = f (g x)
(map . head) x = map (head x)

2. If we set
x :: [a -> b]
then we have
head x :: a -> b

3. \x -> map (head x) :: [a -> b] -> [a] -> [b]
=>
map . head :: [a -> b] -> [a] -> [b]

Let’s now try to find a senseful definition of [a -> b] -> [a] -> [b] by using hole-driven approach (inspired by http://www.youtube.com/watch?v=52VsgyexS8Q):

In this example we are using a silent hole (i.e. undefined). We are also writing the types of each argument (xs and ys)

{-# LANGUAGE ScopedTypeVariables #-}

f :: forall a b. [a -> b] -> [a] -> [b]
f xs ys = undefined
    where
    _ = xs :: [a -> b]
    _ = ys :: [a]

It’s compiling. But it does nothing.

Let’s try to do the same with a noisy hole:

data Hole = Hole

f :: forall a b. [a -> b] -> [a] -> [b]
f xs ys = Hole
    where
    _ = xs :: [a -> b]
    _ = ys :: [a]

We get this error:
Couldn’t match expected type `[b]’ with actual type `Hole’

How do we get from Hole to [b]? We have xs :: [a -> b]. What if instead we tried map?

f :: forall a b. [a -> b] -> [a] -> [b]
f xs ys = map Hole Hole
    where
    _ = xs :: [a -> b]
    _ = ys :: [a]

We now get these errors:
Couldn’t match expected type `a0 -> b’ with actual type `Hole’
Couldn’t match expected type `[a0]’ with actual type `Hole’

The second hole is obviously just ys:

f :: forall a b. [a -> b] -> [a] -> [b]
f xs ys = map Hole ys
    where
    _ = xs :: [a -> b]
    _ = ys :: [a]

Now we get:
Couldn’t match expected type `a -> b’ with actual type `Hole’

What if we try to use just xs instead of Hole?

Couldn’t match expected type `a -> b’ with actual type `[a -> b]’

Now it wants just x, not xs.

One way to get to that is to use (head xs)

f :: forall a b. [a -> b] -> [a] -> [b]
f xs ys = map (head xs) ys
    where
    _ = xs :: [a -> b]
    _ = ys :: [a]

This makes it happy.

Conclusion:
The function of type [a -> b] -> [a] -> [b] is not unique, because we could use something else for map instead of just (head xs).

Bonus:
id’ :: forall a. a -> a
id’ x = Hole
where
_ = x :: a

Couldn’t match expected type `a’ with actual type `Hole’

id’ x = x

Folds in imperative languages

Today I was again playing with folds, but this time implementing them in PHP.
So, consider the list [a, b, c]:
foldr f z [a,b,c] = a `f` (b `f` (c `f` z))
foldl f z [a,b,c] = (((a `f` b) `f` c) `f` z) = z `f` (c `f` (a `f` b))

Note the bolded parts of foldr and foldl. They look pretty same, just that the elements are kind of reversed (e.g. (0 – 1 – 2 – 3) against (3 – 2 – 1 – 0)).

Now consider we use array_reduce in PHP like this:


<?php
var_dump(array_reduce(array(1,2,3), function($x, $y) { return $x - $y; } ));

This will print “-6” as a result. And in Haskell:


Prelude> foldl (-) 0 [1,2,3]
-6

So it turns out we’re using a left fold. What can we do to make it a right fold?
What if we try something like


<?php
var_dump(array_reduce(array(1,2,3), function($x, $y) { return $y - $x; } ));

This one will print “2” as a result. Haskell says:


Prelude> foldr (-) 0 [1,2,3]
2

And in Haskell we can also do something like


Prelude> let f x y = y - x
Prelude> foldr f 0 [1,2,3]
-6

So, it’s interesting to play around with associativity of binary operators 🙂
But we don’t want to do it in a “hacky” way by changing the operands in the function. So we want foldr.
Here’s the full PHP code with foldr and foldl implemented:


<?php
interface iBinaryOperator {
    public function calc($x, $y);
}

function foldr(iBinaryOperator $f, $z, array $xs) {
    try {
        $x = array_shift($xs);
        if ($x == null) {
            return $z;
        }
        return $f->calc($x, foldr($f, $z, $xs));
    } catch (Exception $e) {
        return "error";
    }
}

// Foldl is tail recursive
function foldl(iBinaryOperator $f, $z, array $xs) {
    try {
        $x = array_shift($xs);
        if ($x == null) {
            return $z;
        }
        return foldl($f, $f->calc($z, $x), $xs);
    } catch (Exception $e) {
        return "error";
    }
}

class Subtraction implements iBinaryOperator {
    public function calc($x, $y) {
        if (gettype($x) != "integer" || gettype($y) != "integer") {
            throw new Exception("MathException");
        }
        return $x - $y;
    }
/*
    public function partial_calc($x) {
        if (gettype($x) != "integer") {
            throw new Exception("MathException");
        }
        return function ($y) use ($x) {
            if (gettype($y) != "integer") {
                throw new Exception("MathException");
            }
            return $x - $y;
        };
    }
*/
}

$x = new Subtraction();
var_dump(foldr($x, 0, array(1,2,3)));
var_dump(foldl($x, 0, array(1,2,3)));

/*
> php test.php
int(2)
int(-6)
*/

Simple IO stuff

Today we’ll play around with the following task:

Write a program that will receive a number N as an input, and that then will do additional N reads and print them out.

So first, let’s see how we can do this with a static number of times, say 3:

*Main> (readLn :: IO Int) >>= \x -> return $ x : []
1
[1]

*Main> (readLn :: IO Int) >>= \x -> (readLn :: IO Int) >>= \y -> return $ x : y : []
1
2
[1,2]
*Main> (readLn :: IO Int) >>= \x -> (readLn :: IO Int) >>= \y -> (readLn :: IO Int) >>=
  \z -> return $ x : y : z : []
1
2
3
[1,2,3]
*Main>

So what we need, is a function that will take a number as an input, and spit list of IO Ints as an output.
There are 2 ways to do this, and we’ll do both and then discuss the difference between them.

pn :: Int -> [IO Int]
pn' :: Int -> IO [Int]

What pn does, is returns a list of IO actions that need to be executed (which we can execute with sequence for example).
pn’ on the other hand, returns an IO action of list of numbers.
So we can view pn’ as one IO action that returns a list of numbers, and pn as a list of IO actions that return a number. Pretty simple!

Let’s start with pn. To cover the base case, we say that pn = 0 will return an empty list of actions. Note that [] is [IO Int] here, not [Int].

pn 0 = []

For the inductive step, we store the *read a number using readLn* action in a list (but not execute it), and then append that action to the list. Note that x is IO Int here, not Int.

pn n = do
-- No need for explicit signature here because Haskell already knows this by the fn signature
  let x = readLn :: IO Int
  x : pn (n - 1)

Now, to execute this, we can do:

*Main> sequence $ pn 3
1
2
3
[1,2,3]
*Main>

And what is sequence?

*Main> :t sequence
sequence :: Monad m => [m a] -> m [a]
sequence []     = return []
sequence (x:xs) = do v <- x; vs <- sequence xs; return (v:vs)
*Main>

So, sequence takes a list of monads (IO actions in this case), executes each one of them and returns their result *combined*.
pn’ is the same implementation as pn, but with sequence _within_ it so that we don’t have to call sequence each time.

pn' :: Int -> IO [Int]
pn' 0 = return []
pn' n = do
  x <- readLn
  xs <- pn' (n - 1)
  return $ x : xs

For the base case, we need a monadic empty list, that is:

*Main> :t return
return :: Monad m => a -> m a
*Main> :t (return []) :: IO [Int]
(return []) :: IO [Int] :: IO [Int]

For the inductive step, we are reading a number into x, i.e. executing the read action with <- (in contrast to =). Note that x is Int here, not IO Int.
Then, we recursively call pn’ to read for more numbers, and store those executed reads in xs. Note that xs is [Int] here.
After that, we return x:xs in order to get a IO [Int] instead of [Int].
We can now call it like:

*Main> pn' 3
1
2
3
[1,2,3]
*Main>

Let’s try to unwrap this call:

test :: IO [Int]
test = do
  x' <- readLn
  xs' <- do
    x'' <- readLn
    xs'' <- do
      x''' <- readLn
      xs''' <- return []
      return $ x''' : xs'''
    return $ x'' : xs''
  return $ x' : xs'

And that’s how it works. Or, we can rewrite everything as:


Prelude> let pn'' n = sequence $ take n $ repeat (readLn :: IO Int)
Prelude> pn'' 5
1
2
3
4
5
[1,2,3,4,5]
Prelude>

But it’s worth knowing what it does behind the scenes 🙂
We can now extend this function to accept an additional parameter (function), that will do something to the read value:

pn''' :: Int -> (Int -> Int) -> IO [Int]
pn''' 0 _ = return []
pn''' n f = do
  x <- readLn
  xs <- pn'' (n - 1) f
  return $ (f x) : xs

And call it like:

*Main> pn''' 5 succ
1
2
3
4
5
[2,3,4,5,6]
*Main>

Vector and Monoids

Today I was playing around with Haskell and Monoids, and here’s what I came up with:

import Data.Monoid

data Vector = Vector (Int, Int) deriving (Show)
newtype SumVector = SumVector Vector deriving (Show)
newtype ProductVector = ProductVector Vector deriving (Show)

instance Monoid (SumVector) where
    mempty = SumVector $ Vector (0, 0)
    SumVector (Vector (a, b)) `mappend` SumVector (Vector (c, d)) = SumVector $ Vector (a+c, b+d)

instance Monoid (ProductVector) where
    mempty = ProductVector $ Vector (1, 1)
    ProductVector (Vector (a, b)) `mappend` ProductVector (Vector (c, d)) = ProductVector $ Vector (a*c, b*d)

What I was interested in, what is the difference between newtype and data? Since when I replaced newtype with data, the code worked as expected.
Here’s what I got as an answer:

 what is the main difference between data and newtype?
 Bor0: newtype has the same runtime representation as the underlying type
 data can do sum types (using |) and multi-element records
 you'd use newtype for performance reasons, when it is possible to do so
 also, you can use GeneralizedNewtypeDeriving to pull existing instances into your new type
 e.g., suppose you have data Foobar { ... } and instance ToJSON Foobar
 then you can do newtype Baz = Baz Foobar deriving (ToJSON)
 you can't do that with data Baz = Baz Foobar