This tutorial serves as a very short and quick summary of the first few chapters of TAPL.
My previous post was a general overview of how we can design an evaluator and a type checker.
This post is more focused on building the lambda calculus from scratch. It provides an interesting overview of some design decisions particularly for the lambda calculus.
2. Untyped lambda calculus
Lambda calculus is a formal system for expressing computation. The grammar rules are divided in two parts: function abstraction and function application. Function abstraction defines what a function does, and function application “computes” a function. For example is a function abstraction and
is a function application. The equality sign = is replaced with a dot, and instead of writing
we write
. To represent
we write
.
It was introduced by the mathematician Alonzo Church in the 1930s as part of his research of the foundations of mathematics.
2.1. Syntax
Our syntax, per BNF is defined as follows:
<var> ::= x
<abs> ::= λ<variable>.<term>
<app> ::= <term><term>
<term> ::= <var> | <abs> | <app>
<value> ::= <abs>
Note that a value in the system is just an abstraction. In Haskell code:
type VarName = String
data Term =
TmVar VarName
| TmAbs VarName Term
| TmApp Term Term deriving (Eq, Show)
We set variables to additionally hold VarName (which is a string in this case) for pretty printing and comparison of variables.
2.2. Inference rules (evaluator)
The list of inference rules:
| Name | Rule |
| E-App1 | |
| E-App2 | |
| E-AppAbs |
It looks like we have a new notation here: means that we change all occurences of
with
within
.
So the evaluator looks something like:
eval :: Term -> Term eval (TmApp (TmAbs x t12) v2@(TmAbs _ _)) = subst x t12 v2 -- E-AppAbs eval (TmApp v1@(TmAbs _ _) t2) = let t2' = eval t2 in TmApp v1 t2' -- E-App2 eval (TmApp t1 t2) = let t1' = eval t1 in TmApp t1' t2 -- E-App1 eval _ = error "No rule applies"
Note how we pattern match against v2@(TmAbs _ _) to detect a variable. We do that because we didn’t “treat” variables separately in our data definition.
The only thing left for us to implement is subst, and we should be good.
TAPL defined it for us in 5.3.5:
| Formula | Explanation |
| Renaming all occurences of “x” to “s” within “x” is just “s” | |
| Renaming all occurences of “x” to “s” within “y” (which doesn’t contain any “x”) is just “x” | |
| In an abstraction, as long as the variable in the argument is different, we can recursively substitute the terms | |
| In an application, we recursively substitute all bound variables |
We can convert the definitions to Haskell code easily:
-- subst VarName in Term to Term
subst :: VarName -> Term -> Term -> Term
subst x (TmVar v) newVal
| x == v = newVal
| otherwise = TmVar v
subst x (TmAbs y t1) newVal
| x == y = TmAbs y t1
| x /= y && (y `notElem` freeVars newVal) = TmAbs y (subst x t1 newVal)
| otherwise = error $ "Cannot substitute '" ++ show x ++ "' in term '" ++ show (TmAbs y t1) ++ "'"
subst x (TmApp t1 t2) newVal = TmApp (subst x t1 newVal) (subst x t2 newVal)
But what is freeVars? Well, it’s the list of free variables in an expression. For example, in , we have that x is bound (because it appears in an argument in an abstraction), and y is free. TAPL already defined it as
FV in 5.3.2:
| Formula | Explanation |
| A single variable is free | |
| Free variables in a term within a lambda abstraction are all free variables of the term except the one bound by the abstraction | |
| Free variables of application is the union of the free variables of both terms |
Implementation in Haskell is also straight-forward with this:
freeVars :: Term -> [VarName] freeVars (TmVar x) = [x] freeVars (TmAbs x t1) = nub (freeVars t1) \\ [x] freeVars (TmApp t1 t2) = freeVars t1 ++ freeVars t2
Running a few examples:
Main> let lctest1 = (TmAbs "f" (TmAbs "x" (TmVar "x"))) Main> let lctest2 = (TmAbs "f" (TmAbs "x" (TmVar "y"))) Main> eval $ TmApp lctest1 lctest2 TmAbs "x" (TmVar "x") Main> eval $ TmApp lctest2 lctest1 TmAbs "x" (TmVar "y") Main> eval (TmApp (eval $ TmApp lctest2 lctest1) lctest1) TmVar "y"
Note, however:
Main> eval $ TmApp (TmAbs "x" (TmAbs "y" (TmVar "y"))) (TmAbs "x" (TmVar "y")) *** Exception: Cannot substitute '"x"' in term 'TmAbs "y" (TmVar "y")'
This happens by design, because we chose our system to work up to renaming of bound variables (the otherwise case in subst for abstraction). So, subst isn’t total. Church fixed this issue by using alpha conversion – which is just renaming bound variables. So for example could be re-written as
.
Now we can “fix” our evaluation error by alpha conversion:
Main> eval $ TmApp (TmAbs "x" (TmAbs "z" (TmVar "z"))) (TmAbs "x" (TmVar "y")) TmAbs "z" (TmVar "z")
There’s an alternative approach to fix this issue: De Bruijn indices.
2.3. Bonus: De Bruijn indices
The point with De Bruijn indices is that they remove the need of alpha conversion. They allow that by removing the variable argument in the lambda abstraction, and rely on integers (indices) to refer to variables. The index actually denotes the number of binders that are in scope between that occurrence and its corresponding binder.
For example, we have that because the distance from the x in the application to the abstraction is 0. Similar reasoning for
, etc.
We start with the following data definitions:
type VarNameB = Int
type Indices = (VarName, Int)
data TermB =
TmVarB VarNameB
| TmAbsB TermB
| TmAppB TermB TermB deriving (Eq, Show)
We will also create a function that will convert an old term Term to a De Bruijn’s term TermB:
bruijn :: Term -> Maybe TermB
bruijn e = go [] e where
go :: [Indices] -> Term -> TermB
go m (TmAbs x e) = TmAbsB $ let m' = map (\(x, y) -> (x, y + 1)) m in go ((x, 0) : m') e
go m (TmApp e e') = TmAppB (go m e) (go m e')
go m (TmVar x) = TmVarB $ findByFst m x
findByFst :: [Indices] -> VarName -> Int
findByFst ((x, y) : xs) x' = if x' == x then y else findByFst xs x'
findByFst _ _ = error "Variable not found"
Running it:
Main> bruijn (TmAbs "f" (TmAbs "x" (TmVar "x"))) TmAbsB (TmAbsB (TmVarB 0))
However:
Main> bruijn (TmAbs "f" (TmAbs "x" (TmVar "y"))) TmAbsB (TmAbsB (TmVarB *** Exception: Variable not found
This could be fixed by the function returning a Maybe TermB, or exposing go to consider the list of free variables and do the proper substitution. However, doing this or writing an evaluator for this system is left as an exercise to the reader 😀
2.4. Conclusion
The lambda calculus has a very minimal system, yet is a bit trickier to implement than the previous system we’ve seen. When we introduce function abstraction and application, we introduce complexity. We needed to account for contextual substitution.
An interesting question would be: is there a system that is simpler to implement, yet as powerful as lambda calculus? Maybe combinatory logic? But it’s not really pleasant to use as it’s not readable to us. It’s almost as what we essentially did is choose the minimal “most” readable system, and solve the side-effects (i.e. solving the issue of alpha conversion) of reading/parsing those systems that we humans have as a limitation. I mean, if it was easy for humans to parse SKIKSIKSKS easily, we wouldn’t use lambda calculus (which requires us as language designers to solve all the side-effects due to complexity), right?
In the next tutorial we’ll be implementing a type-checker for the lambda calculus – the typed lambda calculus.
Thank you for posting this.
It was quite useful as I was about to implement this myself as an educational exercise.
I am getting stuck trying to evaluate:
“`
TmApp (TmAbs (TmVar “x”) (TmVar “x”)) (TmVar “y”)
“`
In otherwords the id function applied to a variable y:
“`
(\x -> x) y
“`
which I think should simply return y, but the cases do not seem to cover this.
What am I missing?
E-App1: would simply result in an error, since the first parameter that is being evaluated is an TmAbs, not a TmApp.
E-App2: again results in an error, since the second parameter which is being evaluated is a TmVar, not a TmApp.
E-AppAbs: does not match.
In the book on page 72, it says E-App1 and E-App2 should not be applied, since t1 and t2 respectively can not take a step.
“but the premise further requires that t_1 can take a step”
I also cannot find the answer to this in the book, but maybe I am not searching well enough.
LikeLiked by 1 person
Sorry for the late reply. I believe it’s because the design is such that it expects Church numerals? For example, you could return 0 (\f x -> x):
“`
> eval $ TmApp (TmAbs “x” (TmVar “x”)) (TmAbs “f” (TmAbs “x” (TmVar “x”)))
TmAbs “f” (TmAbs “x” (TmVar “x”))
“`
Alternatively, you can edit the rules as follows to calculate that expression:
“`
eval :: Term -> Term
eval (TmApp (TmAbs x t12) v2) = subst x t12 v2 — E-AppAbs
eval (TmApp t1 t2) = let t1′ = eval t1 in TmApp t1′ t2 — E-App1
eval (TmVar x) = TmVar x
eval _ = error “No rule applies”
“`
LikeLike
“freeVars (TmAbs x t1) = freeVars t1 \\ [x]” does not work for the term Abs “a” (App (Var “a”) (Var “a”))
LikeLiked by 1 person
Thanks! Fixed by using `nub`.
LikeLike
Thanks for this, it is quite useful for a school project I’m working on.
A comment on the following sentence:
“Note how we pattern match against
v2@(TmAbs _ _)to detect a variable. We do that because we didn’t “treat” variables separately in our data definition.”I believe that in this sentence you intend to talk about a ‘value’ where you’ve written ‘variable’.
LikeLike