Towards Hoare logic for a small imperative language in Haskell

Recently I finished Logical Foundations, and I have to admit it was an excellent read. This post is directly inspired by the Simple Imperative Programs section of the same book.

What is an imperative programming language? Well, nowadays they run the world. C, Python, JavaScript, PHP – you name it – we all write code in them using imperative style; that is, we command the computer and tell it what to do and how to do it. The mathematical language is very unlike this – it is more declarative rather than imperative – it doesn’t care about the how.

I’ve always been fascinated by mathematical proofs, especially the mechanical part around them because computers are directly involved in this case. Hoare logic is one of those systems that allows us to mechanically reason about computer programs.

We will implement a small imperative language together with (a very simple) Hoare Logic that will allow us to reason about programs in this imperative language.

1. Type level vs value level (intro)

If you haven’t gone through Logical Foundations, I strongly suggest you do. It’s programming at a different level – even if you know some Haskell already (and if you don’t know Haskell, you should learn it, too). If Logical Foundations seems too much, you can start with my book first, which is a gentler introduction.

If you’ve gone through Logical Foundations, congratulations! In any case, there’s still something to learn from this blog post. In Logical Foundations, we use the dependently typed programming language Coq. There are some advantages and disadvantages in this case. The obvious disadvantage is that all programs must be total and must terminate. As the authors from Logical Foundations say:

In a traditional functional programming language like OCaml or Haskell we could add the while case as follows:

Coq doesn’t accept such a definition (“Error: Cannot guess decreasing argument of fix”) because the function we want to define is not guaranteed to terminate.

The advantage is that we can mathematically reason about all such programs.

Implementing an imperative language in Haskell is more concerned about playing at the value level (and to some extent at the type level), whereas in Coq, we’re completely playing at the type level.

2. Small arithmetic language

Let’s start by implementing a very small arithmetic language: one that can do addition, subtraction and multiplication:

data Aexp =
  ANum Integer
  | APlus Aexp Aexp
  | AMinus Aexp Aexp
  | AMult Aexp Aexp
  deriving (Show)

aeval :: Aexp -> Integer
aeval (ANum n)       = n
aeval (APlus a1 a2)  = aeval a1 + aeval a2
aeval (AMinus a1 a2) = aeval a1 - aeval a2
aeval (AMult a1 a2)  = aeval a1 * aeval a2

That was easy. A simple evaluation:

> aeval (APlus (ANum 1) (ANum 3))
4

3. Small boolean language

While the previous language dealt with numbers, we now want a language that will deal with boolean expressions.

data Bexp =
  BTrue
  | BFalse
  | BEq Aexp Aexp
  | BLe Aexp Aexp
  | BNot Bexp
  | BAnd Bexp Bexp
  deriving (Show)

beval :: Bexp -> Bool
beval BTrue        = True
beval BFalse       = False
beval (BEq a1 a2)  = aeval a1 == aeval a2
beval (BLe a1 a2)  = aeval a1 <= aeval a2
beval (BNot b1)    = not (beval b1)
beval (BAnd b1 b2) = beval b1 && beval b2

Note that this language depends on aeval for some calculations, but it’s still its own language.

> beval $ BEq (APlus (ANum 1) (ANum 0)) (APlus (ANum 0) (ANum 1))
True

Perfect.

4. Small imperative language

We now turn to the main point, and start implementing a small imperative language.

data Command =
  CSkip -- NOP
  | CAss Char Aexp -- X := Y
  | CSeq Command Command -- A; B
  | CIfElse Bexp Command Command -- If A then B else C
  | CWhile Bexp Command -- While A { B }
  deriving (Show)

Note that in the CAss (assignment) case, we use chars to denote variables ('A', 'B', …). So our language supports expressions like X := 3 (which is CAss 'X' (ANum 3)). Great!

Hmm, what about X := Y? It seems that we need to bring variables at the arithmetic level.

4.1. Small arithmetic language’

We modify our arithmetic language to supports variables.

data Aexp =
  ANum Integer
  | AId Char -- This is new.
  | APlus Aexp Aexp
  | AMinus Aexp Aexp
  | AMult Aexp Aexp
  deriving (Show)

Now for the eval function:

aeval :: Aexp -> Integer
aeval (AId v)        = ???
aeval (ANum n)       = n
aeval (APlus a1 a2)  = aeval a1 + aeval a2
aeval (AMinus a1 a2) = aeval a1 - aeval a2
aeval (AMult a1 a2)  = aeval a1 * aeval a2

Whoops. We’re stuck for the AId case. Where do we read this variable from? We turn to contexts (or environments). A context is just a map of chars (variables) to integers (results in our arithmetic language).

import qualified Data.Map as M

type Context = M.Map Char Integer

aeval :: Context -> Aexp -> Integer
aeval ctx (AId v)        = ctx M.! v -- careful: element may not exist
aeval ctx (ANum n)       = n
aeval ctx (APlus a1 a2)  = aeval ctx a1 + aeval ctx a2
aeval ctx (AMinus a1 a2) = aeval ctx a1 - aeval ctx a2
aeval ctx (AMult a1 a2)  = aeval ctx a1 * aeval ctx a2

A few simple evaluations:

> aeval (M.fromList [('X', 3)]) (AId 'X')
3
> aeval (M.fromList [('X', 3)]) (APlus (AId 'X') (ANum 1))
4

4.2. Small boolean language’

Recall that the boolean language was dependent on the arithmetic language, so naturally, the changes in the arithmetic language would impact the boolean language as well. It’s just a matter of passing contexts around.

beval :: Context -> Bexp -> Bool
beval ctx BTrue        = True
beval ctx BFalse       = False
beval ctx (BEq a1 a2)  = aeval ctx a1 == aeval ctx a2
beval ctx (BLe a1 a2)  = aeval ctx a1 <= aeval ctx a2
beval ctx (BNot b1)    = not (beval ctx b1)
beval ctx (BAnd b1 b2) = beval ctx b1 && beval ctx b2

Neat.

4.3. Back to the imperative language

It looks like we have everything we need now. So we turn to the evaluation function:

eval :: Context -> Command -> Context
eval ctx CSkip             = ctx
eval ctx (CAss c v)        = M.insert c (aeval ctx v) ctx
eval ctx (CSeq c1 c2)      = let ctx' = eval ctx c1 in eval ctx' c2
eval ctx (CIfElse b c1 c2) = eval ctx $ if beval ctx b then c1 else c2
eval ctx (CWhile b c)      = if beval ctx b
                             then let ctx' = eval ctx c in eval ctx' (CWhile b c)
                             else ctx

The evaluation function basically accepts a context (or state, or environment), a command, and then returns a modified context (or state, or environment).

  • CSkip is like the NOP (no operation) of imperative programs. Think of it as the empty statement.
  • CAss will insert a variable to the context.
  • CSeq will join two commands and evaluate them both (this allows us to evaluate commands in sequence).
  • CIfElse accepts a boolean, and depending on its value either executes one command, or another
  • CWhile keeps executing a command as long as the boolean is true

Let’s now try to represent factorial in our language. In pseudo-code, it would be:

Z := X
Y := 1
while (~Z = 0)
  Y := Y * Z
  Z := Z - 1

This program will store the factorial of X in Y, that is, Y = X!.

And here’s the same algorithm represented in our imperative programming language:

fact_X =
  let l1 = CAss 'Z' (AId 'X')
      l2 = CAss 'Y' (ANum 1)
      l3 = CWhile (BNot (BEq (AId 'Z') (ANum 0))) (CSeq l4 l5)
      l4 = CAss 'Y' (AMult (AId 'Y') (AId 'Z'))
      l5 = CAss 'Z' (AMinus (AId 'Z') (ANum 1))
  in CSeq l1 (CSeq l2 l3)

Fancy.

> eval (M.fromList [('X', 5)]) fact_X
fromList [('X',5),('Y',120),('Z',0)]

It calculated that 5! = 120. Looks good to me. So much for the imperative language.

5. Assertions (Run-time)

We now want a way to verify properties about our program. There are two ways we can do this:

  • At the meta level: Very easy, we just implement a function in Haskell
  • At the object level (inside the imperative language): A bit trickier, but still easy

5.1. Meta level

Here’s the straight-forward implementation:

assert :: Context -> Bexp -> Command -> Bexp -> Bool
assert ctx boolPre cmd boolPost =
  beval ctx boolPre &&
  beval (eval ctx cmd) boolPost

Together with an example proof that 5! = 120:

fact_Proof = assert
  (M.fromList [('X', 5)])    -- Initial context
  (BEq (ANum 5) (AId 'X'))   -- Before: X == 5
  fact_X
  (BEq (ANum 120) (AId 'Y')) -- After: Y == 120 (Y = X!)

Trying it out:

> fact_Proof
True

It proved it!

5.2. Object level

This change will affect our programming language in several ways. We have to do a bunch of changes here and there.

First, let’s import Data.Either – we’ll see in a moment why we need it:

import Data.Either

Next, let’s slightly modify our Command:

data Command =
...
  | CAssert Bexp Command Bexp
...

The new eval' function will not always return a context. It can now return an error as well (String) – this will happen in the case when the assertion was not fulfilled at runtime. Here’s the new function:

eval' :: Context -> Command -> Either String Context
eval' ctx CSkip             = Right ctx
eval' ctx (CAss c v)        = Right $ M.insert c (aeval ctx v) ctx
eval' ctx (CSeq c1 c2)      = let ctx' = eval' ctx c1 in whenRight ctx' (\ctx'' -> eval' ctx'' c2)
eval' ctx (CIfElse b c1 c2) = eval' ctx $ if beval ctx b then c1 else c2
eval' ctx (CWhile b c)      = if beval ctx b
                             then let ctx' = eval' ctx c in whenRight ctx' (\ctx'' -> eval' ctx'' (CWhile b c))
                             else Right ctx

whenRight :: Either a t -> (t -> Either a b) -> Either a b
whenRight (Right x) f = f x
whenRight (Left x)  _ = Left x

whenLeft :: Either t b -> (t -> Either a b) -> Either a b
whenLeft (Left x) f   = f x
whenLeft (Right x)  _ = Right x

Nothing new going on, we just added handling to propagate the error in case it happens. The real deal is in handling the case for the CAssert command:

eval' ctx (CAssert b1 c b2) =
  if beval ctx b1
  then whenRight (eval' ctx c)
       (\ctx'' -> if beval ctx'' b2
                  then Right ctx''
                  else Left "Post-condition does not match!")
  else Left "Pre-condition does not match!"

Few evaluations:

> eval' (M.fromList [('X', 0)]) (CAssert (BEq (AId 'X') (ANum 0)) CSkip (BEq (AId 'X') (ANum 0)))
Right (fromList [('X',0)])
> eval' (M.fromList [('X', 0)]) (CAssert (BEq (AId 'X') (ANum 1)) CSkip (BEq (AId 'X') (ANum 0)))
Left "Pre-condition does not match!"
> eval' (M.fromList [('X', 0)]) (CAssert (BEq (AId 'X') (ANum 0)) CSkip (BEq (AId 'X') (ANum 1)))
Left "Post-condition does not match!"

6. Towards Hoare logic

In the previous chapter, we implemented assertions at the run-time level. The biggest disadvantage of that is we have to do a full evaluation in order to conclude some proposition. For example, consider fact_Proof – it had to actually evaluate the factorial to conclude that X = 120.

You know how some languages like Python don’t have a compile step? Well, our implementation is kind of equivalent to that. But some languages do have a compile step, like C or Haskell. And this compilation step can be very useful. It can do a lot of checks, e.g. type checks (mathematical proofs!). That’s what we’ll try to do here – implement a “compile”-time check.

Compile-time, run-time, whatever-time is all about having evaluations at different levels. There is still computation going on, but e.g. the computation strategies at the compile-time level may be different from those at the run-time level.

For example, evaluating Command can be expensive, and sometimes even not terminate (our language is _not_ strongly normalizing); consider the evaluation of CWhile BTrue CSkip. But, still, we want to deduce propositions without going through all evaluations. e.g. deduce something about 5! without evaluating 1 \cdot 2 \cdot 3 \cdot 4 \cdot 5.

The way to achieve this, just as a proof of concept, is to implement a subset of Hoare Logic. This sounds easy but is actually quite complicated. To start with, I found some useful slides. The mathematical formulas for Hoare logic can be found there, and even though they look simple, implementing them is a completely different thing. The implementation details cover stuff like “what expressions do we want to support”, “are we working with a strongly normalizing language”, “what’s the language that will represent propositions”, etc. while mathematics does not care about these deals.

6.1. Hoare triples

What’s a Hoare triple? It’s a command, having a pre-condition (Bexp) and a post-condition (Bexp). In this case, we decided that pre/post-conditions will be in the language of Bexp, but they can also be in a different language (regardless of that Bexp is used in Command).

data HoareTriple =
  HoareTriple Bexp Command Bexp
  deriving (Show)

6.2. Hoare assignment

We’ll start by implementing a small subset of Hoare assignment. Given an assignment command V := E, it will produce the triple where the precondition is Q[E/V] and the postcondition is Q, for any Q (p. 15 of the slides).

hoareAssignment :: Command -> Bexp -> Maybe HoareTriple
hoareAssignment (CAss v e) q = Just $ HoareTriple (substAssignment q e v) (CAss v e) q
hoareAssginment _ _ = Nothing

What about substAssignment? This is another case where we have to make a design decision – how far do we want to go. There are several ways:

  • One way to do it is to do full-blown Bexp/Aexp evaluation (since those are strongly normalizing, compared to Command) but this evaluation can still take some time.
  • Another way is to specify a language (a subset of Aexp/Bexp) that only has some certain properties and is faster to evaluate.
  • The third way is to specify some concrete set of “optimizations” (or mathematical rewrites) that we will support, based on our original language (Aexp/Bexp).

We’ll go with the third way.

substAssignment :: Bexp -> Aexp -> Char -> Bexp
substAssignment q@(BEq (AId x) y) e v
  | x == v    = BEq e y
  | otherwise = q
substAssignment q@(BEq x (AId y)) e v
  | y == v    = BEq e (AId y)
  | otherwise = q
substAssignment q _ _ = q

In this case, we just implement replacing when either the expression on the left or on the right is a variable. Our program can now “prove” some real cool properties, such as:

> hoareAssignment (CAss 'X' (ANum 3)) (BEq (AId 'X') (ANum 3))
Just (HoareTriple (BEq (ANum 3) (ANum 3)) (CAss 'X' (ANum 3)) (BEq (AId 'X') (ANum 3)))

That is, given the pre-condition that 3 = 3, the assignment X := 3 implies that X = 3. Note how we don’t check the validity of the pre-condition (since that would require evaluating Aexp/Bexp). We just say that assuming this pre-condition, this command produces that post-condition. Pretty cool!

6.3. Hoare skip

We can also implement hoareSkip as follows:

hoareSkip :: Command -> Bexp -> Maybe HoareTriple
hoareSkip (CSkip) q = Just $ HoareTriple q CSkip q
hoareSkip _ _ = Nothing

Now, in addition, the following Hoare triple can be produced:

> hoareSkip CSkip (BEq (AId 'X') (ANum 3))
Just (HoareTriple (BEq (AId 'X') (ANum 3)) CSkip (BEq (AId 'X') (ANum 3)))

6.4. Hoare sequence

Finally, we will implement the Hoare sequence rule.

Start by adding deriving Eq to both Aexp and Bexp – we will need this to be able to compare expressions from either of those languages.

hoareSequence :: HoareTriple -> HoareTriple -> Maybe HoareTriple
hoareSequence (HoareTriple p c1 q1) (HoareTriple q2 c2 r)
  | q1 == q2  = Just $ HoareTriple p (CSeq c1 c2) r
  | otherwise = Nothing

We can now sequence our programs as follows:

> import Data.Maybe (fromJust)
> let first = hoareAssignment (CAss 'X' (ANum 3)) (BEq (AId 'X') (ANum 3)) in let second = hoareSkip CSkip (BEq (AId 'X') (ANum 3)) in hoareSequence (fromJust first) (fromJust second)
Just (HoareTriple (BEq (ANum 3) (ANum 3)) (CSeq (CAss 'X' (ANum 3)) CSkip) (BEq (AId 'X') (ANum 3)))

Which is awesome!

Conclusion

The final code can be found at https://github.com/bor0/hoare-imp/.

This was a cool small research project that I did. I always enjoy tackling languages, mathematical proofs, and programs. The beautiful thing is that Hoare’s logic combines all of these. And tackling them is fun because they lie at the heart of Computer Science.

One thought on “Towards Hoare logic for a small imperative language in Haskell

Leave a comment