Fixnums and other immediate values


#1

uLisp uses tagged unions as its exclusive value representation. Is there a particular reason this is the case? There are fairly simple optimisations that can be achieved from there.

As a proof of concept, I’ve hacked 13-bit fixnums into the AVR version. I did this by also moving the type identifiers around a bit, so that their bit patterns won’t intersect with either the fixnums or valid pointers. I’ve gotten this to mostly work, but there is an assumption in the rest of the code that any object pointer can be dereferenced, and I’m not sure I’ve tracked all those instances down.

In any case, this means that for numbers between -4096 and 4095, uLisp doesn’t have to allocate from the workspace for the number itself, but instead allocates the cons cell and stuffs the number right in the car. Platforms with larger pointers get more bits to play with, so the ARM version of this, for instance, would get 29-bit fixnums.

I haven’t looked closely into other immediate values, but it should be possible to do the same for a bunch of them - including symbols, which would be the biggest win.


Proposal to store immediate values in the object cell
New beta version of ARM uLisp with new features for feedback
#2

I considered this when I was first planning uLisp, and I’ve worked with other small Lisps that took this approach. I decided that I needed full 16-bit integers, for compatibility with the Arduino routines, and rejected the idea of switching between two representations of integers as I thought it would impose an overhead that would negate the benefit. However, it’s a good idea and I’d be very interested to see how it works out.

What do you mean by:

there is an assumption in the rest of the code that any ‘object’ pointer can be dereferenced, and I’m not sure I’ve tracked all those instances down.

If I understand this better I can probably highlight the instances for you.


#3

Full-width integers definitely need to be available; if nothing else, they give the ability to pass pointers along, which is a requirement for doing any real compilation, say on the ARM. I started with the AVR because I have a handful of ATmega328PUs lying around, and only a single SAM D21 that I haven’t really worked with before.

I haven’t done any comparison benchmarking, largely because running the tak and fib benchmarks kept triggering what I think was a GC bug I caused. I got past that issue now, though, so I should probably get on the numbers.

What I mean is that the code is written in such a way that most of the internal code uses the -> operator (directly or indirectly) quite liberally, with no checking whether that’s safe - reasonably enough. I just threw a non-safe value their way, which previously couldn’t occur on that code path. I initially thought it was relatively easy to map out where those not-really-a-pointers would go, but they made trouble in places I didn’t expect - a fun one being when 0 turned into a synonym for ‘)’ in the reader.

I’ll put the code on Github once I’ve convinced myself it’s not going to put anything on fire, should be easier to look for any remaining issues then.


#4

Look forward to seeing it!


#5

Initial numbers, using the fib function on the benchmarks page - which should exercise the GC and simple arithmetic - are interesting. Timing and garbage stats are different parameters, because I could run a bigger test for GC test than for timing.

Without the fixnums, AVR version 3.0b:

  • (fib 20) causes 1222 invocations of the GC, collecting 223 117 objects.
  • (with-millis () (fib 15)) returns 666.

With fixnums:

  • (fib 20) causes 989 invocations of the GC, collecting 202 808 objects.
  • (with-millis () (fib 15)) returns 558.

The GC runs less often, but the average number of objects collected each time rises. I’ll see if I can be arsed to do more testing, but I’d say these numbers suggest there’s not an overhead involved.

I’ve pushed the code to GitHub, under a fork:

As you’ll see, I moved around the other two uses that are made of otherwise invalid object pointers. The key element here is that a valid pointer will not have the 2-bit enabled. The GC now ignores things based on that, freeing the remaining fourteen bits. (The mark bit is still used in the same way.) Fixnums get half of that space, such that (fixnum & 6) == 2. The type identifiers as previously used end up having (type & 15) == 6. I stuffed the parser tokens in with them for now, but as all of these are used as symbolic constants in the code already, they’re easy to change if a different mapping looks better.


#6

That’s great! Did the whole of uLisp still fit on an ATmega328P?

Presumably another benefit on platforms with limited RAM, such as the ATmega328P, is that programs typically take up less space? For example:

'(1 2 3 4 5)

should use half the number of objects. I assume the biggest overhead will be in something like:

(1+ 4095)

and

(1- 4096)

David


#7

Yes, although the Arduino IDE says it’s now 97% of the available space, rather than 96%.

314> (let ((x 4095)) (room))
298

314> (let ((x 4096)) (room))
297

I’ll compare a list later, I got this through SSH.

Well, those functions work by unboxing the integer given, and then boxing the result. I didn’t actually have to touch them.

314> (for-millis () (dotimes (i 10000) (1+ 1)))
833

314> (for-millis () (dotimes (i 10000) (1+ 4095)))
873

314> (for-millis () (dotimes (i 10000) (1+ 4096)))
874

This makes sense to me; boxing and unboxing a fixnum is just a set of shifts, whereas the same for integers requires memory access. This would probably be a bigger difference on ARM, where the fixnum only takes one instruction.


#8

I had a look at your changes, and there are surprisingly few changes needed! Nice work.

I also ran your version through my test suite:

http://www.ulisp.com/show?1AA0#testsuites

It threw up a few errors; in the following lines the parameters to aeq are:

  • An identifying symbol.
  • The value that should be returned.
  • The form to be evaluated.
(aeq 'atom nil (atom '(1 2)))
(aeq 'numberp t (numberp (+ 1 2)))
(aeq 'setf 220 (let ((a '(2 3 4))) (setf (nth 1 a) 11 (nth 2 a) 10) (apply * a)))

Should be easy to fix,
David


#9

Easy enough. atom had a slight bug, I’m guessing because I had to turn the logic of some of those functions around and missed that one. numberp hadn’t been extended to also test for fixnums. I threw in a fixnump function in uLisp while I was at it.

The final line doesn’t seem to be in that form in the test suite on the web; in any case it worked for me after having applied the first fix. I’d guess the reason is that the atom call in teq went awry.

It should, but we’d presumably need to be using a debugger to count that exactly. With printgcs defined, the non-fixnum version gives me:

{0}314> '(1 2 3 4 5)
(1 2 3 4 5)

{14}314> '(1 2 3 4 5 6 7 8 9 10)
(1 2 3 4 5 6 7 8 9 10)

{24}

With the fixnums, I get:

{0}314> '(1 2 3 4 5)
(1 2 3 4 5)

{9}314> '(1 2 3 4 5 6 7 8 9 10)
(1 2 3 4 5 6 7 8 9 10)

{14}

The numbers are consistent; adding five numbers to a list adds ten objects without fixnums, and five with fixnums. There is also an increase of as many objects as the list contains between the non-fixnum and fixnum versions. Moreover, there’s a constant overhead of four objects - and I presume two of those are the symbol quote and a cons cell for it in front of the list. The following lines are all with fixnums:

{0}314> 1
1

{1}314> (quote 1)
1

{4}314> (quote (1))
(1)

{5}314> '(1)
(1)

{5}

I’m inclined to think the overhead seen here is something to do with how the reader works.