Recursion from first principles

Recursion is one of the mechanisms that makes general-purpose computation possible – whether on your computer, smart TV, or smartphone.

For example, here’s a definition of the addition function represented in first-order logic:

\forall y, add(0, y, y) \\ \forall x, y, z, add(x, y, z) \to add(S(x), y, S(z))

Or, the more commonly known variant:

\begin{aligned} 0+y &= y \\ S(x)+y &= S(x+y) \end{aligned}
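These two equations translate almost verbatim into Haskell; here is a minimal sketch using a unary `Nat` type (the names `Nat`, `Z`, `S`, and `toInt` are ours, chosen for this illustration):

```haskell
-- Unary (Peano) naturals: Z is zero, S is the successor.
data Nat = Z | S Nat deriving Show

-- Addition, mirroring the two equations above.
add :: Nat -> Nat -> Nat
add Z     y = y            -- 0 + y = y
add (S x) y = S (add x y)  -- S(x) + y = S(x + y)

-- Convert back to Int for readability.
toInt :: Nat -> Int
toInt Z     = 0
toInt (S n) = 1 + toInt n
```

For example, `toInt (add (S (S Z)) (S (S (S Z))))` computes 2 + 3 and yields 5.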

In this blog post, we will build recursive functions up from a small set of axioms, starting with primitive recursion and then generalizing it.

Primitive recursion

Note that primitive recursion can be defined within first-order logic using Peano axioms. All Peano axioms except the induction axiom are statements in first-order logic. This relation to computation is one of the things that makes number theory interesting to study.

In order to represent primitive recursive functions, we need the following axioms for basic primitive functions:

  1. The constant function C_{n}^{k}(x_{1},\ldots ,x_{k}) = n of type \mathbb{N}^k \to \mathbb{N}
  2. The successor function S(x) = x + 1 of type \mathbb{N} \to \mathbb{N}
  3. The projection function P_{i}^{k}(x_{1},\ldots ,x_{k}) = x_i of type \mathbb{N}^k \to \mathbb{N}

The first and the third functions are generalized to k parameters. Here are a few examples, one for each:

  1. C_4^1(x) = C(x) = 4. The constant function returns 4 for any x
  2. S(5) = 5 + 1 = 6. The successor function returns the next number
  3. P_{i}^3(x_1, x_2, x_3) = x_i. The projection function can return one of x_1, x_2, or x_3. E.g. P_1^3(x_1, x_2, x_3) = x_1

The constant function represents the base case of the recursion, and the successor function increments the loop index. The projection function allows the extraction of a variable from a sequence.

In addition, to construct more complex primitive recursive functions, we can use the following two axioms:

  1. Composition: f(x_1, \ldots, x_k) = h(g_1(x_1, \ldots, x_k), \ldots, g_m(x_1, \ldots, x_k)), where f : \mathbb{N}^k \to \mathbb{N}, and the functions g_1, \ldots, g_m : \mathbb{N}^k \to \mathbb{N} and h : \mathbb{N}^m \to \mathbb{N} are functions constructed previously (starting from the constant, successor, and projection functions)
  2. The recursive operator \rho, defined as \rho(g, h) = f, where f : \mathbb{N}^{k+1} \to \mathbb{N}, g : \mathbb{N}^k \to \mathbb{N} and h : \mathbb{N}^{k+2} \to \mathbb{N} with \begin{aligned} f(0,x_{1},\ldots ,x_{k})&=g(x_{1},\ldots ,x_{k}) \\ f(S(y),x_{1},\ldots ,x_{k})&=h(y,f(y,x_{1},\ldots ,x_{k}),x_{1},\ldots ,x_{k}) \end{aligned}

f represents a for-loop, starting from 0 and proceeding upwards to its first argument. The remaining arguments represent the state of the computation. g describes the initial calculations, and h describes the iterative calculations.
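This loop reading of \rho can be sketched directly in Haskell; the variant below works over plain Ints and is named `rhoLoop` here (our name) to keep it distinct from the list-based implementation later in the post:

```haskell
-- rhoLoop g h builds f: the Int argument is the loop counter,
-- the list holds the remaining arguments x_1, ..., x_k (the loop state).
rhoLoop :: ([Int] -> Int) -> (Int -> Int -> [Int] -> Int) -> Int -> [Int] -> Int
rhoLoop g _ 0 xs = g xs                                   -- f(0, xs) = g(xs)
rhoLoop g h n xs = h (n - 1) (rhoLoop g h (n - 1) xs) xs  -- f(S y, xs) = h(y, f(y, xs), xs)
```

For instance, addition of 3 and 10 is `rhoLoop head (\_ acc _ -> acc + 1) 3 [10]`, which evaluates to 13: g returns the second addend, and h increments the accumulator once per iteration.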

Applying these axioms a finite number of times allows us to generate all primitive recursive functions.

Going back to the addition example, we can now represent it as Add = \rho(P_1^1, S \circ P^3_2); we choose g = P^1_1 to capture Add(0, y) = g(y) = y and h = S \circ P^3_2 to capture Add(S(x),y)=h(x,Add(x,y),y)=(S\circ P_{2}^{3})(x,Add(x,y),y)=S(Add(x,y)).

Example evaluation:

\begin{aligned} Add(2, 2) &= \rho (P_{1}^{1},S\circ P_{2}^{3})(S(1), S(1)) \\ \text{iteration:}\quad & f(S(1), S(1)) = h(1, f(1, S(1)), S(1)) = (S \circ P^3_2)(1, f(1, S(1)), S(1)) \\ \text{iteration:}\quad & S(f(1, S(1))) = S(h(0, f(0, S(1)), S(1))) = S((S \circ P^3_2)(0, f(0, S(1)), S(1))) \\ \text{base:}\quad & S(S(f(0, S(1)))) = S(S(g(S(1)))) = S(S(P^1_1(S(1)))) = S(S(S(1))) = 4 \\ \end{aligned}

A direct implementation of the rules in Haskell:

-- Tuples in N^k are encoded as Haskell lists; every function maps a list to a list.
constant n _ = [n]                  -- C_n^k: ignores its arguments and returns n

successor [x] = [x + 1]             -- S: expects exactly one argument

projection i xs = [xs !! (i - 1)]   -- P_i^k: returns the i-th argument (1-indexed)

-- Composition: feed xs to every g_i, concatenate the results, then apply h.
compose h gs xs = h (concatMap ($ xs) gs)

-- The recursion operator: the head of the list is the loop counter.
rho g h = f
  where
  f (0:xs) = g xs                                    -- base case: g(xs)
  f (n:xs) = h ([n - 1] ++ f ((n - 1) : xs) ++ xs)   -- h(y, f(y, xs), xs)

Example evaluations:

> add = rho (projection 1) (compose successor [projection 2])
> add [5,5]
[10]
> double = compose add [projection 1, projection 1]
> double [20]
[40]
> mult = rho (constant 0) (compose add [projection 2, projection 3])
> mult [2,5]
[10]
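Other familiar functions fall out of the same combinators. As one more sketch, a predecessor function (with pred(0) = 0) needs nothing beyond a constant and a projection; the basic definitions are restated so the snippet stands alone:

```haskell
-- Basic combinators from the article, restated for self-containment.
constant :: Integer -> [Integer] -> [Integer]
constant n _ = [n]

projection :: Int -> [Integer] -> [Integer]
projection i xs = [xs !! (i - 1)]

rho :: ([Integer] -> [Integer]) -> ([Integer] -> [Integer]) -> [Integer] -> [Integer]
rho g h = f
  where
  f (0:xs) = g xs
  f (n:xs) = h ([n - 1] ++ f ((n - 1) : xs) ++ xs)

-- Predecessor: pred(0) = 0 via g = constant 0; pred(S y) = y via h = projection 1,
-- which simply returns the loop index.
predecessor :: [Integer] -> [Integer]
predecessor = rho (constant 0) (projection 1)
```

Evaluating `predecessor [5]` yields `[4]`, and `predecessor [0]` yields `[0]`.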

General recursion

Whereas loops in primitive recursive functions are always bounded, general recursive functions lift this restriction and allow unbounded loops.

One additional axiom is required for that:

  1. The minimization operator \mu (f)(x_{1},\ldots ,x_{k}) = z, where z is the least number such that f(z,x_{1},\ldots ,x_{k}) = 0 while f(i, x_{1}, \ldots, x_{k}) > 0 for all i = 0,\ldots ,z-1, for some function f : \mathbb{N}^{k+1} \to \mathbb{N}

Minimization starts searching from 0 and proceeds upwards to find the minimal argument that causes the function to return zero. If there is no such argument, or if f is undefined, the search never terminates.

With the minimization operator, the system becomes powerful enough to simulate a Turing machine.

A direct implementation in Haskell:

-- Search upwards from 0 for the least n with f(n, xs) = 0.
-- Note the list convention: results are compared against [0] and wrapped as [n],
-- matching the other combinators.
mu f xs = go 0
  where
  go n = if f (n:xs) == [0] then [n] else go (n + 1)
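As a usage sketch (the predicate and the constant 25 are ours, purely for illustration), \mu can search for the least n whose square reaches 25; mu is restated here, using the list convention of the other combinators, so the snippet stands alone:

```haskell
-- Minimization: search upwards from 0 for the least n with f(n, xs) = [0].
mu :: ([Integer] -> [Integer]) -> [Integer] -> [Integer]
mu f xs = go 0
  where
  go n = if f (n : xs) == [0] then [n] else go (n + 1)

-- Least n with n * n >= 25, i.e. the square root of 25.
sqrt25 :: [Integer]
sqrt25 = mu (\(n:_) -> [if n * n >= 25 then 0 else 1]) []
```

Evaluating `sqrt25` yields `[5]`; had no such n existed, the search would loop forever, exactly as the definition warns.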

Conclusion

There are many ways to represent computation, and recursive functions are just one of them. Other popular models are Turing machines* and lambda calculus**, and all three have been shown to be equivalent in terms of what they can compute. In fact, we represented recursive functions within a system based on lambda calculus – Haskell. The Church-Turing thesis states that essentially all computation can be carried out by a Turing machine – note that this is a thesis rather than a theorem, since the informal notion of "effectively computable" cannot itself be formalized.

We showed how “all” computation can be captured by just a few axioms: the constant function, the successor function, the projection function, function composition, the recursion operator, and the minimization operator.

Recently, I’ve been busy working on Budge, an esoteric programming language. If we look at the paper, we can notice that the function E for evaluating a Budge program is similar to the recursion operator \rho (in that code is repeated based on some condition) and to the minimization operator \mu (code can be repeated an unbounded number of times).

*: A Turing machine T consists of a finite set of states Q, a finite set of symbols \Sigma, an initial state s \in Q, and a function \delta : (Q \times \Sigma) \rightarrow (\Sigma \times \{L,R\} \times Q) that defines the next state.

**: Lambda calculus expressions can be x (variables), (\lambda x.M) (abstraction), and (M N) (application). The evaluation rule ((\lambda x.M)\ N)\rightarrow (M[x:=N]) applies, where M[x:=N] replaces every free occurrence of x in M (i.e., every occurrence not bound by an enclosing abstraction) with N.
