Writing a lambda calculus evaluator in Haskell

This tutorial serves as a very short and quick summary of the first few chapters of TAPL.

My previous post was a general overview of how we can design an evaluator and a type checker.

This post is more focused on building the lambda calculus from scratch. It provides an interesting overview of some design decisions particularly for the lambda calculus.

2. Untyped lambda calculus

Lambda calculus is a formal system for expressing computation. The grammar rules are divided in two parts: function abstraction and function application. Function abstraction defines what a function does, and function application “computes” a function. For example f(x) = x + 1 is a function abstraction and f(3) is a function application. The equality sign = is replaced with a dot, and instead of writing f(x) we write \lambda x. To represent f(x) = x + 1 we write \lambda x.x + 1.

It was introduced by the mathematician Alonzo Church in the 1930s as part of his research of the foundations of mathematics.

2.1. Syntax

Our syntax, per BNF is defined as follows:

<var>   ::= x
<abs>   ::= λ<variable>.<term>
<app>   ::= <term><term>

<term>  ::= <var> | <abs> | <app>
<value> ::= <abs>

Note that a value in the system is just an abstraction. In Haskell code:

type VarName = String

data Term =
    TmVar VarName
    | TmAbs VarName Term
    | TmApp Term Term deriving (Eq, Show)

We set variables to additionally hold VarName (which is a string in this case) for pretty printing and comparison of variables.

2.2. Inference rules (evaluator)

The list of inference rules:

NameRule
E-App1\frac{t_1 \to t_1'}{t_1t_2 \to t_1't_2}
E-App2\frac{t_2 \to t_2'}{v_1t_2 \to v_1t_2'}
E-AppAbs(\lambda x.t_{12})v_2 \to [x \mapsto v_2]t_{12}

It looks like we have a new notation here: [x \mapsto v_2]t_{12} means that we change all occurences of x with v_2 within t_{12}.

So the evaluator looks something like:

eval :: Term -> Term
eval (TmApp (TmAbs x t12) v2@(TmAbs _ _)) = subst x t12 v2 -- E-AppAbs
eval (TmApp v1@(TmAbs _ _) t2)            = let t2' = eval t2 in TmApp v1 t2' -- E-App2
eval (TmApp t1 t2)                        = let t1' = eval t1 in TmApp t1' t2 -- E-App1
eval _ = error "No rule applies"

Note how we pattern match against v2@(TmAbs _ _) to detect a variable. We do that because we didn’t “treat” variables separately in our data definition.

The only thing left for us to implement is subst, and we should be good.

TAPL defined it for us in 5.3.5:

FormulaExplanation
[x \mapsto s]x = sRenaming all occurences of “x” to “s” within “x” is just “s”
[x \mapsto s]y = yRenaming all occurences of “x” to “s” within “y” (which doesn’t contain any “x”) is just “x”
[x \mapsto s](\lambda y . t_1) = \lambda y.[x \mapsto s]t_1, y \neq xIn an abstraction, as long as the variable in the argument is different, we can recursively substitute the terms
[x \mapsto s](t_1 t_2) = [x \mapsto s]t_1 [x \mapsto s]t_2, y \neq x \land y \notin FV(s)In an application, we recursively substitute all bound variables

We can convert the definitions to Haskell code easily:

-- subst VarName in Term to Term
subst :: VarName -> Term -> Term -> Term
subst x (TmVar v) newVal
    | x == v    = newVal
    | otherwise = TmVar v
subst x (TmAbs y t1) newVal
    | x == y                                  = TmAbs y t1
    | x /= y && (y `notElem` freeVars newVal) = TmAbs y (subst x t1 newVal)
    | otherwise                               = error $ "Cannot substitute '" ++ show x ++ "' in term '" ++ show (TmAbs y t1) ++ "'"
subst x (TmApp t1 t2) newVal = TmApp (subst x t1 newVal) (subst x t2 newVal)

But what is freeVars? Well, it’s the list of free variables in an expression. For example, in \lambda x.x \ y, we have that x is bound (because it appears in an argument in an abstraction), and y is free. TAPL already defined it as FV in 5.3.2:

FormulaExplanation
FV(x) = \{x\}A single variable is free
FV(\lambda x.t_1) = FV(t_1) \setminus \{x\}Free variables in a term within a lambda abstraction are all free variables of the term except the one bound by the abstraction
FV(t_1 t_2) = FV(t_1) \cup FV(t_2)Free variables of application is the union of the free variables of both terms

Implementation in Haskell is also straight-forward with this:

freeVars :: Term -> [VarName]
freeVars (TmVar x) = [x]
freeVars (TmAbs x t1) = nub (freeVars t1) \\ [x]
freeVars (TmApp t1 t2) = freeVars t1 ++ freeVars t2

Running a few examples:

Main> let lctest1 = (TmAbs "f" (TmAbs "x" (TmVar "x")))
Main> let lctest2 = (TmAbs "f" (TmAbs "x" (TmVar "y")))
Main> eval $ TmApp lctest1 lctest2
TmAbs "x" (TmVar "x")
Main> eval $ TmApp lctest2 lctest1
TmAbs "x" (TmVar "y")
Main> eval (TmApp (eval $ TmApp lctest2 lctest1) lctest1)
TmVar "y"

Note, however:

Main> eval $ TmApp (TmAbs "x" (TmAbs "y" (TmVar "y"))) (TmAbs "x" (TmVar "y"))
*** Exception: Cannot substitute '"x"' in term 'TmAbs "y" (TmVar "y")'

This happens by design, because we chose our system to work up to renaming of bound variables (the otherwise case in subst for abstraction). So, subst isn’t total. Church fixed this issue by using alpha conversion – which is just renaming bound variables. So for example \lambda x.x could be re-written as \lambda y.y.

Now we can “fix” our evaluation error by alpha conversion:

Main> eval $ TmApp (TmAbs "x" (TmAbs "z" (TmVar "z"))) (TmAbs "x" (TmVar "y"))
TmAbs "z" (TmVar "z")

There’s an alternative approach to fix this issue: De Bruijn indices.

2.3. Bonus: De Bruijn indices

The point with De Bruijn indices is that they remove the need of alpha conversion. They allow that by removing the variable argument in the lambda abstraction, and rely on integers (indices) to refer to variables. The index actually denotes the number of binders that are in scope between that occurrence and its corresponding binder.

For example, we have that \lambda f . \lambda x . x = \lambda \lambda 0 because the distance from the x in the application to the abstraction is 0. Similar reasoning for \lambda f . \lambda x . f = \lambda \lambda 1, etc.

We start with the following data definitions:

type VarNameB = Int
type Indices = (VarName, Int)

data TermB =
    TmVarB VarNameB
    | TmAbsB TermB
    | TmAppB TermB TermB deriving (Eq, Show)

We will also create a function that will convert an old term Term to a De Bruijn’s term TermB:

bruijn :: Term -> Maybe TermB
bruijn e = go [] e where
    go :: [Indices] -> Term -> TermB
    go m (TmAbs x e)  = TmAbsB $ let m' = map (\(x, y) -> (x, y + 1)) m in go ((x, 0) : m') e
    go m (TmApp e e') = TmAppB (go m e) (go m e')
    go m (TmVar x)    = TmVarB $ findByFst m x
    findByFst :: [Indices] -> VarName -> Int
    findByFst ((x, y) : xs) x' = if x' == x then y else findByFst xs x'
    findByFst _ _              = error "Variable not found"

Running it:

Main> bruijn (TmAbs "f" (TmAbs "x" (TmVar "x")))
TmAbsB (TmAbsB (TmVarB 0))

However:

Main> bruijn (TmAbs "f" (TmAbs "x" (TmVar "y")))
TmAbsB (TmAbsB (TmVarB *** Exception: Variable not found

This could be fixed by the function returning a Maybe TermB, or exposing go to consider the list of free variables and do the proper substitution. However, doing this or writing an evaluator for this system is left as an exercise to the reader 😀

2.4. Conclusion

The lambda calculus has a very minimal system, yet is a bit trickier to implement than the previous system we’ve seen. When we introduce function abstraction and application, we introduce complexity. We needed to account for contextual substitution.

An interesting question would be: is there a system that is simpler to implement, yet as powerful as lambda calculus? Maybe combinatory logic? But it’s not really pleasant to use as it’s not readable to us. It’s almost as what we essentially did is choose the minimal “most” readable system, and solve the side-effects (i.e. solving the issue of alpha conversion) of reading/parsing those systems that we humans have as a limitation. I mean, if it was easy for humans to parse SKIKSIKSKS easily, we wouldn’t use lambda calculus (which requires us as language designers to solve all the side-effects due to complexity), right?

In the next tutorial we’ll be implementing a type-checker for the lambda calculus – the typed lambda calculus.

7 thoughts on “Writing a lambda calculus evaluator in Haskell

  1. Thank you for posting this.
    It was quite useful as I was about to implement this myself as an educational exercise.
    I am getting stuck trying to evaluate:
    “`
    TmApp (TmAbs (TmVar “x”) (TmVar “x”)) (TmVar “y”)
    “`
    In otherwords the id function applied to a variable y:
    “`
    (\x -> x) y
    “`
    which I think should simply return y, but the cases do not seem to cover this.
    What am I missing?

    E-App1: would simply result in an error, since the first parameter that is being evaluated is an TmAbs, not a TmApp.
    E-App2: again results in an error, since the second parameter which is being evaluated is a TmVar, not a TmApp.
    E-AppAbs: does not match.

    In the book on page 72, it says E-App1 and E-App2 should not be applied, since t1 and t2 respectively can not take a step.
    “but the premise further requires that t_1 can take a step”

    I also cannot find the answer to this in the book, but maybe I am not searching well enough.

    Liked by 1 person

    1. Sorry for the late reply. I believe it’s because the design is such that it expects Church numerals? For example, you could return 0 (\f x -> x):

      “`
      > eval $ TmApp (TmAbs “x” (TmVar “x”)) (TmAbs “f” (TmAbs “x” (TmVar “x”)))
      TmAbs “f” (TmAbs “x” (TmVar “x”))
      “`

      Alternatively, you can edit the rules as follows to calculate that expression:

      “`
      eval :: Term -> Term
      eval (TmApp (TmAbs x t12) v2) = subst x t12 v2 — E-AppAbs
      eval (TmApp t1 t2) = let t1′ = eval t1 in TmApp t1′ t2 — E-App1
      eval (TmVar x) = TmVar x
      eval _ = error “No rule applies”
      “`

      Like

  2. “freeVars (TmAbs x t1) = freeVars t1 \\ [x]” does not work for the term Abs “a” (App (Var “a”) (Var “a”))

    Liked by 1 person

  3. Thanks for this, it is quite useful for a school project I’m working on.

    A comment on the following sentence:

    “Note how we pattern match against v2@(TmAbs _ _) to detect a variable. We do that because we didn’t “treat” variables separately in our data definition.”

    I believe that in this sentence you intend to talk about a ‘value’ where you’ve written ‘variable’.

    Like

Leave a reply to Walter Schulze Cancel reply