Vector and Monoids

Today I was playing around with Haskell and Monoids, and here’s what I came up with:

import Data.Monoid

data Vector = Vector (Int, Int) deriving (Show)
newtype SumVector = SumVector Vector deriving (Show)
newtype ProductVector = ProductVector Vector deriving (Show)

instance Monoid (SumVector) where
    mempty = SumVector $ Vector (0, 0)
    SumVector (Vector (a, b)) `mappend` SumVector (Vector (c, d)) = SumVector $ Vector (a+c, b+d)

instance Monoid (ProductVector) where
    mempty = ProductVector $ Vector (1, 1)
    ProductVector (Vector (a, b)) `mappend` ProductVector (Vector (c, d)) = ProductVector $ Vector (a*c, b*d)

What I was interested in, what is the difference between newtype and data? Since when I replaced newtype with data, the code worked as expected.
Here’s what I got as an answer:

 what is the main difference between data and newtype?
 Bor0: newtype has the same runtime representation as the underlying type
 data can do sum types (using |) and multi-element records
 you'd use newtype for performance reasons, when it is possible to do so
 also, you can use GeneralizedNewtypeDeriving to pull existing instances into your new type
 e.g., suppose you have data Foobar { ... } and instance ToJSON Foobar
 then you can do newtype Baz = Baz Foobar deriving (ToJSON)
 you can't do that with data Baz = Baz Foobar

foldl and foldr

First, we start with the definitions:
foldr:

foldr f z (x:xs) = f x (foldr f z xs) (*)
foldr _ z []     = z                 (**)

foldl:

foldl f z (x:xs) = foldl f (f z x) xs (*)
foldl _ z []     = z                 (**)

foldr starts from the right and goes on to the left (right-associative), while foldl does exactly the opposite.
As usual, it’s best to see with examples. Assume generic list of [a, b, c] and generic function f:

Case foldr:

foldr f z (x:xs) = f x (foldr f z xs)
  => foldr f z [a, b, c]
  = f a (foldr f z [b, c])         (by (*))
  = f a (f b (foldr f z [c])       (by (*))
  = f a (f b (f c (foldr f z []))) (by (*) and (**))
  = f a (f b (f c z))

Case foldl:

foldl f z (x:xs) = foldl f (f z x) xs
  => foldl f z [a, b, c]           (by (*))
  = foldl f (f z a) [b, c]         (by (*))
  = foldl f (f (f z a) b) [c]      (by (*))
  = foldl f (f (f (f z a) b) c) [] (by (*) and (**))
  = (f (f (f z a) b) c)

So we’ve shown the associativity of both foldr and foldl.
Why do we need both foldr and foldl? Because of associativity of course. There are some cases where we’d want to use foldr, and some cases where we’d want to use foldl.
Consider some f (binary function) which is not associative (e.g. ‘-‘). Try evaluating:


*Main> foldr (-) 0 [1,2,3]
2
*Main> foldl (-) 0 [1,2,3]
-6

So, what really inspired me to write this article is one of the examples given in LYAH:


solveRPN x = foldl foldingFunction [] $ words x
    where   foldingFunction (x:y:ys) "*" = (x * y):ys
            foldingFunction (x:y:ys) "+" = (x + y):ys
            foldingFunction (x:y:ys) "-" = (y - x):ys
            foldingFunction xs numberString = read numberString:xs

This is a very elegant solution. I tried writing a RPN calculator some time ago in JS (imperative style) and it involved a lot of checks for side cases, while in Haskell where we have the definitions of all functions that we are using, there is no room for side cases. We know exactly what we are trying to evaluate.

If you want a thorough explanation of the solveRPN function, you can check it at http://learnyouahaskell.com/functionally-solving-problems, while here I’ll only give a brief explanation.

Our foldingFunction acts as a stack storage, and accepts two parameters (the first parameter being the initial parameter that we pass (an empty stack), and the second one being the current element taken from the list), as foldl expects. So, in the where clause we are actually pattern matching on foldingFunction .

In the case where we have two (or more) numbers stored into our stack (x:y:ys) and we arrive at “*” as the current element of the input list, then we multiply those two numbers and store the result back in the stack. Likewise is the definition for “+” and “-“. However, if we match none of the specified operators, then we just store it back into our stack as a number.

Class and Instance

Today, we want a generic Animal class, so that we can instantiate animals out of it in a easy way. How can we do that?
First, we’ll want to create our generic class. We do it this way:

class Animal c where
    doSomething :: c -> String
    name :: c -> String

But we have a problem here. We can’t do

class Animal c where
    doSomething :: String
    name :: String

Because Haskell forces us to use the type that will be instantiated in our instances. We’ll cover on that later (*).
So let’s make our Cat and Dog instances now

data Dog = Dog
data Cat = Cat

instance Animal Dog where
    doSomething _ = "Woof!"
    -- name = "A dog" -- you can't do this
    name _ = "A dog"

instance Animal Cat where
    doSomething _ = "Meow!"
    name _ = "A cat"

Now we can do some fancy stuff like

*Main> doSomething Dog
"Woof!"
*Main> doSomething Cat
"Meow!"
*Main> name Dog
"A dog"
*Main> name Cat
"A cat"
*Main>

So, basically, what we just wrote is “syntactic sugar” for

data DogOrCat = Dog' | Cat'

name' Dog' = "A dog"
name' Cat' = "A cat"

doSomething' Dog' = "Woof!"
doSomething' Cat' = "Meow!"

But the problem with this approach is that, whenever we want to instantiate another animal, we’d have to change the already existing structure (DogOrCat). We don’t want that.
We also risk in calling name’ for an Animal which may not have an instance. Instance forces you to instantiate all functions in the Class.
(*) Basically, this is what triggered this post: I had a discussion with a friend of mine today, and he said that you can’t do something like this in Haskell:

data Dog = Dog { name :: String; age :: Int, bark :: BarkKind }
data Cat = Cat { name :: String, age :: Int, meow :: MeowKind } -- error.

Which is correct. The reason for this is the namespace. name and age (declared in Dog) are global functions, and we are trying to re-declare them in Cat.
He wanted this kind of structure in order to be able to parse some JSON stuff.
I headed over to #haskell and had a discussion, where one guy had this implementation as a suggestion:

instance FromJSON Dog where
 parseJSON (Object v) =
    Dog  v .: "name"
         v .: "bites"
 parseJSON _ = mzero

instance FromJSON Cat where
 parseJSON (Object v) =
    Cat  v .: "name"
         v .: "likes-humans"
 parseJSON _ = mzero

This is using the Aeson library. Basically, that’s what Aeson did. They declared some generic class, so that we can make instances out of it in a elegant way where types will not conflict with each other.