One thing that was always interesting to me about Lisps in general is their minimality, which means that with a couple of starting points you can implement almost anything in the language, even parts of the language itself.
SICP in chapter 2 asks us to re-implement car, cons, and cdr (related post).
That’s a neat way to represent structured data.
Now, for something more interesting, how about we re-implement the struct syntax, by providing our own create-struct macro that will generate a data type with fields, and procedures for retrieving them?
For example, if we did the following:
(create-struct person (firstname lastname))
This macro should generate these procedures for us:
person– to construct an object of such type, which accepts 2 parameters in this caseperson?– to check if a variable is really of that struct type, which accepts 1 parameterperson-firstname– to retrieve the first field of a person object, which accepts 1 parameterperson-lastname– to retrieve the second field of a person object, which accepts 1 parameter
That is, the code above should generate something similar to this:
(define (person firstname lastname) (list 'person firstname lastname)) (define (person? p) (and (list? p) (>= (length p) 1) (equal? (car p) 'person))) (define (person-firstname p) (list-ref p 1)) (define (person-lastname p) (list-ref p 2))
So that when we evaluate it, we get:
> (person-firstname (person "Hello" "World")) "Hello" > (person-lastname (person "Hello" "World")) "World" > (person? (person "Hello" "World")) #t
Sounds fun?
We will use non-hygienic macros just for fun, since I find them easier to write code with and understand them.
So, our code starts with:
#lang racket ; Non-hygienic macros support. (require mzlib/defmacro)
Let’s start by just implementing our constructor person initially:
(define-macro (create-struct struct-name struct-fields)
(list 'begin
(list 'define struct-name
(list 'lambda struct-fields (cons 'list (cons (list 'quote struct-name) struct-fields))))))
Testing it:
> (person "Hello" "World") '(person "Hello" "World")
Seems to be working fine.
To get a better grasp for the code above, what I usually do is change it to a function, and call it to see what it returns. But testing it this way can be tricky as it won’t have the variables in context, so you will have to quote allthethings:
#lang racket
(require mzlib/defmacro)
(define (create-struct struct-name struct-fields)
(list 'begin
(list 'define struct-name
(list 'lambda struct-fields (cons 'list (cons (list 'quote struct-name) struct-fields))))))
(create-struct 'person '(firstname lastname))
; returns '(begin (define person (lambda (firstname lastname) (list 'person firstname lastname))))
Looking good.
Now, let’s proceed by implementing person?:
(define-macro (create-struct struct-name struct-fields)
(append
(list 'begin
(list 'define struct-name
(list 'lambda struct-fields (cons 'list (cons (list 'quote struct-name) struct-fields))))
(list 'define (string->symbol (string-append (symbol->string struct-name) "?"))
(list 'lambda '(x) (list 'and (list 'list? 'x) (list '>= (list 'length 'x) 1) (list 'equal? (list 'car 'x) (list 'quote struct-name))))))))
Testing it:
> (person? (person "Hello" "World")) #t > (person? '(1 2 3)) #f > (person? '(person 2 3)) #t
We will have to re-use the string->symbol magic we did, so it’s good to even have that as a macro:
(define-macro (symbol-append . x) (list 'string->symbol (list 'apply 'string-append (list 'map 'symbol->string (cons 'list x))))) (symbol-append 'a 'b 'c) ; returns 'abc
Looks good. Now, let’s try to use it in our macro by changing that line to:
(list 'define (symbol-append struct-name '?)
We get the following:
. symbol-append: unbound identifier in module (in the transformer environment, which does not include the macro definition that is visible to run-time expressions) in: symbol-append
After some searching through the docs, I found out that it’s related to compilation phases. Note that in REPL mode it would not make any difference.
In order to define a function that we can use in our macro, we can use define-for-syntax. It acts just like define, except that the binding is at phase level 1 instead of phase level 0.
(define-for-syntax (symbol-append . x) (string->symbol (apply string-append (map symbol->string x))))
Now our code works. Phew!
To finalize our macro, we also have to implement the getters. We could try with the following code:
(define struct-name 'test)
(define struct-fields '(a b c))
(map (lambda (field index)
(list 'define
(symbol-append struct-name '- field)
(list 'lambda (list 'ctx) (list 'list-ref 'ctx index))))
struct-fields
(range 1 (+ 1 (length struct-fields))))
What this code does is it will map through all the struct-fields, create a definition for them named by using our symbol-append helper concatenated with their field name, and then the body of that function should just use list-ref. Note how we added another field to our map (index) so that we know what to list-ref to.
Running the code above produces:
'((define test-a (lambda (ctx) (list-ref ctx 1))) (define test-b (lambda (ctx) (list-ref ctx 2))) (define test-c (lambda (ctx) (list-ref ctx 3))))
Looks good. Now, let’s try to merge the code above in our macro, so that the final definition is:
(define-macro (create-struct struct-name struct-fields)
(append
(list 'begin
(list 'define struct-name
(list 'lambda struct-fields (cons 'list (cons (list 'quote struct-name) struct-fields))))
(list 'define (symbol-append struct-name '?)
(list 'lambda '(x) (list 'and (list 'list? 'x) (list '>= (list 'length 'x) 1) (list 'equal? (list 'car 'x) (list 'quote struct-name))))))
(map (lambda (field index)
(list 'define
(symbol-append struct-name '- field)
(list 'lambda (list 'ctx) (list 'list-ref 'ctx index))))
struct-fields
(range 1 (+ 1 (length struct-fields))))))
Whoops, again a compilation phases error:
range: unbound identifier in module (in phase 1, transformer environment) in: range
By visiting the same docs as above, we just need to do the following:
; We need to do this so that the bindings for `range` are provided in context for the macro. (require (for-syntax racket))
And that concludes our macro. Let’s give it a few tries now:
(create-struct person (firstname lastname)) (define test-p1 (person "Test" "Person")) (person-firstname test-p1) (person-lastname test-p1) (person? test-p1) (struct person-2 (firstname lastname)) (define test-p2 (person-2 "Test" "Person")) (person-2-firstname test-p2) (person-2-lastname test-p2) (person-2? test-p2) #| Produces: "Test" "Person" #t "Test" "Person" #t |#
This post re-implements the struct keyword just for fun. It also demonstrates how tricky compilation and runtime phases can be.
Bonus: Serialization works pretty well with a structure like this. If you try to serialize a struct you will see it binds the data to the current executing module:
#lang racket
(require racket/serialize)
(serializable-struct person (firstname lastname) #:transparent)
(serialize (person "Hello" "World"))
; Produces the following:
; '((3) 1 (('anonymous-module . deserialize-info:person-v0)) 0 () () (0 "Hello" "World"))
There is a good reason for the behaviour above, since a fish structure used by one programmer in one module is likely to be different from a fish structure used by another programmer.
However, you can pass #:prefab with a struct to achieve similar functionality with our struct macro above, that is, generate a “global” serialized value:
#lang racket (require racket/serialize) (struct person (firstname lastname) #:prefab) (serialize (person "Hello" "World")) ; Produces the following: ; '((3) 0 () 0 () () (f person "Hello" "World"))