A Lisp compiler to ARM written in Lisp (2)


#1

About a year ago I published a simple experimental Lisp compiler, written in uLisp, that compiled a Lisp function into ARM machine code.

This is a development of that compiler, with the following additional features:

  • Local variables can be created with let and let*.
  • Loops are supported using loop, dotimes, and return.
  • In addition to if it now provides the conditional statements when and unless.
  • The functions first and rest can be used as synonyms of car and cdr.
  • The function nth can be used to return the nth item of a list.
  • The integer arithmetic functions / and mod are now supported.
  • Unary - and not are provided.
  • The unary predicates zerop, plusp, minusp, oddp, and evenp are supported.
  • The shift function ash is supported.
  • It takes advantage of ARM Thumb-2 instructions movw and movt to handle large immediate values, sdiv and mls to provide / and mod, and cbz, cbnz, and it to give more compact code.

Because it uses Thumb-2 instructions, this version of the compiler requires a board with an M4, M33, or later ARM processor. These include boards based on the ATSAMD51, RP2350, or nRF52840. Unlike the original version of the compiler it won’t work on boards with an M0 or M0+ ARM processor such as the ATSAMD21, RP2040, or the nRF51822.

The original article gave a series of example programs that you could compile with the compiler. In this article I give several additional examples that can be compiled by the additional features in this version of the compiler.

Introduction

When I added the facility of executing machine code to uLisp I had in mind the eventual goal of being able to compile uLisp functions into machine code, and this is a first step in that direction.

The nice thing about compiling Lisp is that you don’t have to write a tokeniser or parser, because Lisp programs are already in a consistent structure that can be processed by another Lisp program.

The compiler program is written in the subset of Common Lisp supported by uLisp, and will run on an ARM board with an M4, M33, or later ARM processor, and with at least 5000 objects of workspace. I used a Circuit Playground Bluefruit. You can also run it using Common Lisp on a laptop or desktop computer, and display the code it generates, but of course you won’t be able to run the ARM machine code because Common Lisp doesn’t have uLisp’s defcode command.

I got my initial inspiration for this compiler from Peter Norvig’s book “Paradigms of Artificial Intelligence Programming”.

Here’s the source of this version of the compiler: Lisp compiler for ARM 2.

To use the compiler you also need to load the ARM assembler from: ARM assembler in uLisp.

For more information about the assembler see ARM assembler overview.

Using the compiler

To use this compiler you simply call compile on a Lisp function; for example:

(compile 'factor)

The function will be compiled into a machine code function, replacing the original Lisp code, so that calling factor will now execute the ARM machine-code version.

You can also display the code generated for an expression by calling comp on the expression; for example:

(pprint (comp '(* 13 17)))

(:integer
  ($mov 'r0 13)
  ($push '(r0))
  ($mov 'r0 17)
  ($pop '(r1))
  ($mul 'r0 'r1))

I give examples of several simple Lisp programs that it will successfully compile later in this article, together with a comparison of the speed of the Lisp and machine-code versions.

Before compiling a new function you might want to remove the previous one from memory using makunbound to free up the code memory before compiling the next function; for example:

(makunbound 'factor)

Alternatively you could increase the amount of memory available for machine code by editing the directive such as:

#define CODESIZE 256 

before uploading uLisp to your board.

How the compiler works

Register usage

To avoid needing to keep track of register usage the compiler makes use of the stack to pass values to an expression, and store the value returned by an expression.

The following table shows how the ARM registers are used within the compiler:

Registers Use
r0 r1 r2 r3 Used to pass the parameters to the main function’s arguments.
r0 Contains the value returned by the main function.
r4 r5 r6 r7 Contain copies of the function arguments within the function.
r0 r1 Used to pass the arguments to each operator.
r0 Used to return the value from each operator.
r2 Used to return the true/nil value from comparisons.

Compiling an expression

The following steps show the sequence of compiling an expression, such as:

(* x 13)
  • Code is generated to evaluate each of the arguments, in this case x and 13, and each result is pushed onto the stack, apart from the last which is left in r0.
  • The first value is popped from the stack into register r1.
  • The function, in this case *, is then evaluated for r1 and r0, with the result in r0.

This stack-based approach ensures that a more complex expression, such as:

(* (- x 1) (+ x 13))

will also compile into correct code, without conflicts between registers.

Calling the function recursively

The compiler supports calling a function recursively from within the function itself. Because the registers corresponding to the parameters and local variables would be overwritten by the recursive call they are stored on the stack around the function call.

There are several recursive functions in the examples below.

Types

For boolean operations I decided to represent nil as zero, and t as 1. A problem I hadn’t anticipated was that I would need to keep track of what type of object each function returned, integer or boolean. For example, consider the problem of compiling the statement:

(and x y)

If x has the value 0 and y has the value 7 this should return 7. However, if x has the value nil and y has the value 7 this should return nil. Representing nil as zero leads to an ambiguity.

I solved this by returning a type, :integer or :boolean, with each compiled expression, according to the following rules:

  • Predicates, and t or nil, always return a :boolean.
  • Arithmetic operations always return an :integer.
  • An if form requires a :boolean test form and returns an :integer.
  • A progn or let block returns the type of its last expression.

An item with an ambiguous type returns the type nil.

The compiler gives a warning if a function seems to use incorrect types.

Examples

I used the following simple examples to test the new features in this version of the compiler. The compiler will also compile all the examples in the earlier article: A Lisp compiler to ARM written in Lisp.

To compile a Lisp function you simply give the command compile followed by the name of the function; for example, to compile the factor function:

(compile 'factor)

Factor

This function takes a simple approach to finding the least prime factor of a number by testing candidate factors up to the square root of the number. It uses the new features of the compiler: let, loop, return, when, and mod:

(defun factor (n)
  (let ((d 2) (i 1))
    (loop
     (when (> (* d d) n) (return n))
     (when (zerop (mod n d)) (return d))
     (setq d (+ d i)) (setq i 2))))

Lisp version:

> (time (factor 2147302777))
46327
Time: 5.3 s

Compiled version:

> (time (factor 2147302777))
46327
Time: 24 ms

Subset Sum problem

This is an NP-hard problem that can only be solved by an exhaustive search (see Subset sum problem on Wikipedia). The problem is as follows: given a list of integers and a target total, find whether the target can be reached by summing a subset of the integers.

The following recursive Lisp program solves the problem (see Recursive solution to Subset Sum on Lispology). It uses the new features of the compiler: nth and zerop :

(defun subsetsum-p (lis n sum)
  (if (zerop sum) t
    (if (zerop n) nil
      (if (> (nth (1- n) lis) sum) (subsetsum-p lis (1- n) sum)
        (or (subsetsum-p lis (1- n) sum)
          (subsetsum-p lis (1- n) (- sum (nth (1- n) lis))))))))

To test the function I used this list:

(defvar *lis* '(2 4 6 8 10 12 14 16 18 20 22 24 26 28 30))

Lisp version:

> (time (subsetsum-p *lis* (length *lis*) 200))
t
Time: 10.8 s

> (time (subsetsum-p *lis* (length *lis*) 199))
nil
Time: 21.5 s

Here I’ve contrived the list to only contain even numbers, so we can be sure that 199 will be unreachable.

Compiled version

The compiled version returns the numbers 0 for nil and 1 for t:

> (time (subsetsum-p *lis* (length *lis*) 200))
1
Time: 93 ms

> (time (subsetsum-p *lis* (length *lis*) 199))
0
Time: 93 ms

Exponentiation

The following function calculates x^y. The result must be a 32-bit integer. It uses the following new features: let, loop, return, zerop, oddp, and ash:

(defun iex (x y)
  (let ((e 1))
    (loop
     (when (zerop y) (return e))
     (when (oddp y) (setq e (* e x)))
     (setq x (* x x))
     (setq y (ash y -1)))))

Lisp version:

> (time (iex 7 11))
1977326743
Time: 1 ms

Compiled version

> (time (iex 7 11))
1977326743
Time: 0 ms

Reverse digits

This function reverses the digits of an integer:

(defun reversedigits (n)
  (let ((a 0))
    (loop
     (setq a (+ (* 10 a) (mod n 10)))
     (setq n (truncate n 10))
     (when (zerop n) (return a)))))

It uses the new features: let, loop, return, zerop, truncate, and mod:

Lisp version:

> (time (reversedigits 123456789))
987654321
Time: 2 ms

Compiled version

> (time (reversedigits 123456789))
987654321
Time: 0 ms

Compiler source

Here’s the source of this version of the compiler: Lisp compiler for ARM 2.

To use the compiler you also need to load the ARM assembler from: ARM assembler in uLisp.

The Thumb-2 extensions are included with the compiler.