Proposed specification for format in uLisp


#1

One of the new features I propose to incorporate in an upcoming version of uLisp is support for Common Lisp’s format function. It probably won’t surprise you to learn that I’m not planning to implement the full ANSI specification of format! Instead I want to provide a subset that’s compatible with Common Lisp, and offers the features that will be most useful in embedded applications.

Here’s the proposed specification; please let me know if there are features of format I’ve omitted that you think would be desirable in uLisp:

format function

Syntax: (format output controlstring arguments*)

Outputs its arguments formatted according to the format directives in the control string.

The output argument can be one of:

  • nil: format returns the formatted output as a string.
  • t: format prints the formatted output to the serial output.
  • a stream: format prints the formatted output to the specified stream.
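
As a minimal sketch of the first two cases, here is how they behave in Common Lisp, which the proposal aims to match. The variable name *label* is only for illustration, and the stream case is omitted because it depends on which stream is available.

```lisp
;; nil: nothing is printed; the formatted text comes back as a string.
(defvar *label* (format nil "Reading: ~a" 42))   ; *label* is now "Reading: 42"

;; t: the text is printed, and in Common Lisp format itself returns nil.
(format t "Reading: ~a~%" 42)
```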

The control string can contain the following format directives:

Directive  Name         Description
~a         ASCII        Output as princ would print it in human-readable format.
~s         Sexp         Output as prin1 would print it, with escape characters.
~d         Decimal      Decimal, as princ would print it.
~x         Hexadecimal  Hexadecimal with no leading zeros.
~g         General      General floating-point format, as princ would print it.
~~         Tilde        Just prints a ~.
~%         Newline      Outputs a newline.
~&         Newline      Outputs a newline unless already at the start of a line.

Any other characters in the control string are just output unchanged.
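
A few illustrative calls, checked against Common Lisp behaviour rather than against a uLisp release, show how some of the directives in the table combine with literal text. The values are arbitrary.

```lisp
;; ~a prints a string bare; ~s prints it so it can be read back in.
(format nil "~a vs ~s" "volts" "volts")   ; → "volts vs \"volts\""

;; ~d and ~x print the same integer in base 10 and base 16.
(format nil "~d = #x~x" 16 16)            ; → "16 = #x10"

;; ~~ gives a literal tilde, and ~% ends the line.
(format nil "~~~d~%" 5)                   ; → "~5" followed by a newline
```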

For example:

> (format t "The battery is ~a V or ~a %." batt (* batt 20))
The battery is 4.75 V or 95 %.

If the argument is the wrong type for the directives ~d, ~x, or ~g, the argument is output as if with ~a.
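
For ~d and ~x this fallback matches what Common Lisp already does with non-integer arguments. A small sketch, with a made-up "n/a" placeholder value:

```lisp
;; An integer is printed in decimal as expected.
(format nil "~d" 42)       ; → "42"

;; A string is not an integer, so it is printed as ~a would print it.
(format nil "~d" "n/a")    ; → "n/a"
(format nil "~x" "n/a")    ; → "n/a"
```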

Each of the directives ~a, ~s, ~d, ~g, or ~x can include a width parameter after the tilde, and the output is padded with spaces to the specified width. If the output wouldn’t fit in the specified width it is output in full, and overflows the width. The ~a and ~s directives pad with spaces to the right; all the other directives pad with spaces to the left. In uLisp this is the only difference between ~a and ~d or ~g.
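
The padding rules can be seen in a short sketch. The square brackets are just literal text to make the padding visible, and the results are those given by Common Lisp:

```lisp
;; ~a pads on the right, so the text is left-aligned in the field.
(format nil "[~5a]" "ab")     ; → "[ab   ]"

;; ~d pads on the left, so the number is right-aligned.
(format nil "[~5d]" 42)       ; → "[   42]"

;; Output wider than the field is printed in full, not truncated.
(format nil "[~2d]" 12345)    ; → "[12345]"
```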

The width parameter will be useful for creating tabular data, or formatting output to be displayed on output devices such as seven-segment displays; for example:

> (format nil "~2d-~2d-~2d" hr min sec)
" 1-45- 7"

#2

Well, saying ‘specification’ rather than ‘documentation’ changes the context a bit, so I have a question: What happens if an undefined format directive is given?

I can see three options: The first is that tildes that aren’t parts of defined directives are passed through; the second that such sequences are errors; and the third is that it depends on the implementation. Obviously, this last one only makes sense if there is a distinction between uLisp-the-language and uLisp-the-implementations. This is why I think the use of ‘specification’ rather than ‘documentation’ matters.

I do think that it makes sense to draw such a distinction, and point to my uLisp-in-Common-Lisp layer as one possible use of that distinction.


#3

Good question. My opinion is that it would be best if unimplemented format directives, and unimplemented parameters in the implemented directives, gave an error, for the following reasons:

It would be consistent with the other aspects of uLisp’s implementation.

Also, I think it might be very difficult to debug errors if unimplemented directives were passed through.
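
For comparison, here is a rough sketch of how an unknown directive behaves on the Common Lisp side. The helper try-format is hypothetical, it uses handler-case, which is Common Lisp rather than part of this proposal, and the result shown for "~q" is what SBCL gives; other implementations may differ.

```lisp
;; Hypothetical helper: run format, returning :format-error instead of
;; entering the debugger if the control string is rejected.
(defun try-format (control &rest args)
  (handler-case (apply #'format nil control args)
    (error () :format-error)))

(try-format "~d items" 3)   ; → "3 items"
(try-format "~q" 3)         ; → :format-error in SBCL, as ~q is not a defined directive
```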

The aim is to make it possible to run uLisp programs under Common Lisp and get identical behaviour, but not to run Common Lisp programs under uLisp and get an error-free approximation to the behaviour, even if that were possible.


#4

Yep, best is error on unimplemented / erroneous. Don’t leave the user under the impression that everything is “fine”.


#5

The last example I gave in the specification has persuaded me that being able to define a pad character is important, so I’ve added an optional pad character to the ~d, ~x, and ~g directives. For example, you will be able to write:

> (format nil "~2,'0d-~2,'0d-~2,'0d" 1 45 7)
"01-45-07"

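The same width-and-pad syntax works with ~x and with pad characters other than zero. A brief sketch, using Common Lisp results and arbitrary values:

```lisp
;; Zero-padded hexadecimal, e.g. for printing a register value.
(format nil "0x~4,'0x" 18)   ; → "0x0012"

;; Any character can follow the quote; here a dot is used as a leader.
(format nil "~6,'.d" 42)     ; → "....42"
```
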
#6

This is precisely my objective with the CL shim. It’s made easier by the fact that much of the effort behind creating Common Lisp was figuring out how to make divergent dialects of Lisp able to coexist in the same image.

Incidentally, the summary on the language reference page is missing the list function for some reason.


#7

You’re right! Thanks.