Better output from Builder


#1

It would certainly make the GitHub diffs a lot easier to understand if the lists of symbol strings and doc strings had nice names.

Is there a reason why the builder is still outputting not-very-useful names like string122?

It would be a lot easier to understand and modify if the variables were instead called stuff like string_withsdcard or something like that.

The transformation from Lisp name to C++ name doesn’t have to be too complicated; the simplest is probably to replace * with _ and remove the -. So for example for *features* the variable would be called string__features_ which is a lot clearer than string19.
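To make the rule concrete, here is a minimal sketch of that transformation in C++. It is not the Builder's actual code; the helper name cpp_string_name is hypothetical, and it only covers the two substitutions described above.

```cpp
#include <string>

// Hypothetical helper (not part of the Builder): turn a Lisp symbol name
// into a C++ variable name using the rule suggested above:
// '*' becomes '_', '-' is dropped, everything else is copied unchanged.
std::string cpp_string_name(const std::string& lisp_name) {
    std::string out = "string_";
    for (char c : lisp_name) {
        if (c == '*') {
            out += '_';      // *features* -> _features_
        } else if (c == '-') {
            continue;        // with-sd-card -> withsdcard
        } else {
            out += c;
        }
    }
    return out;
}
```

With this rule, cpp_string_name("*features*") gives string__features_ and cpp_string_name("with-sd-card") gives string_withsdcard. Symbols made of other punctuation, such as + or /=, would still need a rule of their own, since those characters are not valid in a C++ identifier.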

What do you think?


#2

Yes, I could easily do that - I didn’t realise that it would make a difference to anyone. I will try and make the change before I generate the next set of source files.


#3

While you’re at it, I just thought of another modification code-wise that might be helpful to people who want to mod uLisp (which is like, half of its users from looking at this forum):

Instead of doing four separate sections in the source for the functions, the symbol strings, the docstrings, and the lookup table like this:

// ----- functions section ---- //
object* fn_foo(object* args, object* env) {
    // ...
}

object* fn_bar(object* args, object* env) {
    // ...
}

// ----- strings section ---- //
const char string_foo[] = "foo";
const char string_bar[] = "bar";

// ----- docstrings section ---- //
const char doc_foo[] = "Doc for foo";
const char doc_bar[] = "Doc for bar";

// ----- table section ---- //
const tbl_entry_t lookup_table[] = {
    { string_foo, fn_foo, 0123, doc_foo },
    { string_bar, fn_bar, 0156, doc_bar },
};

you could interleave the tbl_entry_t entries, strings, and docstrings with the functions, which would cut the number of sections someone has to scroll back and forth between from the current four down to two:

// ----- functions section ---- //
object* fn_foo(object* args, object* env) {
    // ...
}
const char string_foo[] = "foo";
const char doc_foo[] = "Doc for foo";
tbl_entry_t tb_foo = { string_foo, fn_foo, 0123, doc_foo };

object* fn_bar(object* args, object* env) {
    // ...
}
const char string_bar[] = "bar";
const char doc_bar[] = "Doc for bar";
tbl_entry_t tb_bar = { string_bar, fn_bar, 0156, doc_bar };

// ----- table section ---- //
const tbl_entry_t lookup_table[] = {
    tb_foo,
    tb_bar,
};

A quick test showed that doing this doesn’t add any more bytes to the executable size.
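For anyone who wants to try the layout outside the Arduino IDE, here is a self-contained sketch of it. The object and tbl_entry_t definitions are stand-ins assumed for this example rather than copied from the uLisp source, the function bodies are placeholders, and find_entry is a hypothetical helper added only to show that the table still works when it is built from named entries.

```cpp
#include <cstdint>
#include <cstring>

// Stand-in types, assumed for this sketch only.
struct object;
typedef object* (*fn_ptr_type)(object*, object*);
struct tbl_entry_t {
    const char* string;   // symbol name
    fn_ptr_type fptr;     // implementing function
    uint8_t minmax;       // argument-count byte, treated as opaque here
    const char* doc;      // docstring
};

// ----- functions section: everything for one function sits together ---- //
object* fn_foo(object* args, object* env) { (void)env; return args; }  // placeholder body
const char string_foo[] = "foo";
const char doc_foo[] = "Doc for foo";
const tbl_entry_t tb_foo = { string_foo, fn_foo, 0123, doc_foo };

object* fn_bar(object* args, object* env) { (void)env; return args; }  // placeholder body
const char string_bar[] = "bar";
const char doc_bar[] = "Doc for bar";
const tbl_entry_t tb_bar = { string_bar, fn_bar, 0156, doc_bar };

// ----- table section: the table holds copies of the named entries ---- //
const tbl_entry_t lookup_table[] = {
    tb_foo,
    tb_bar,
};

// Hypothetical helper: linear search of the table by symbol name.
const tbl_entry_t* find_entry(const char* name) {
    for (const tbl_entry_t& entry : lookup_table) {
        if (std::strcmp(entry.string, name) == 0) return &entry;
    }
    return nullptr;  // not found
}
```

Note that the table entries are copies, so find_entry("foo") returns a pointer into lookup_table rather than the address of tb_foo.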

I would have taken it one step further and inlined everything including the function itself by using a C++ lambda function, but strangely this added about 200 bytes to the executable size. Not sure why.
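Roughly what I mean by the lambda version, as a sketch only: the type definitions are stand-ins assumed for this example, and the lambda body is a placeholder. It relies on the fact that a lambda with no captures converts implicitly to a plain function pointer.

```cpp
#include <cstdint>

// Stand-in types, assumed for this sketch only.
struct object;
typedef object* (*fn_ptr_type)(object*, object*);
struct tbl_entry_t {
    const char* string;
    fn_ptr_type fptr;
    uint8_t minmax;
    const char* doc;
};

const char string_foo[] = "foo";
const char doc_foo[] = "Doc for foo";

// The function body lives inside the table entry itself.
// A captureless lambda converts to fn_ptr_type automatically.
const tbl_entry_t tb_foo = {
    string_foo,
    [](object* args, object* env) -> object* {
        (void)env;
        return args;  // placeholder body
    },
    0123,
    doc_foo,
};
```

The downside, apart from the size increase, is that the function no longer has a name like fn_foo that other code can call directly.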


#4

I think there are pros and cons to doing this. One disadvantage is that it will make the source longer, and is it really so difficult to refer to three different sections?

Would anyone else find this useful?


#5

Longer does not necessarily mean less readable. As is, each function is made up of those four parts (ignoring the BUILTINS enum, but that’s a bit harder to get rid of), so why not put them next to each other so that as many parts as possible would fit on the same screen together? My main concern was scrolling, since even the shortest section (the strings) is still several screenfuls tall because of the sheer number of functions, and scrolling takes time proportional to the distance scrolled. Ctrl+F is faster but still takes time, and is a little more awkward if you’re using the 1.x branch of the Arduino IDE.


#6

Yes, I generally agree with that. Let’s see if anyone else would find this useful …


#7

I will propose a slight alternative/compromise where we group the functions, strings, and docs together, but put them in the table as usual. So it would look like:

// ----- functions section ---- //
object* fn_foo(object* args, object* env) {
    // ...
}
const char string_foo[] = "foo";
const char doc_foo[] = "Doc for foo";

object* fn_bar(object* args, object* env) {
    // ...
}
const char string_bar[] = "bar";
const char doc_bar[] = "Doc for bar";

// ----- table section ---- //
const tbl_entry_t lookup_table[] = {
    { string_foo, fn_foo, 0123, doc_foo },
    { string_bar, fn_bar, 0156, doc_bar },
};

It’s more jumping around than making standalone table entries but doesn’t add length to the code.

I do think having the functions, names, and docs grouped together would make it easier to work with, since I rarely care about reading the docs or names sequentially but always want to see the name and docs of the function I’m using. It’s annoying sometimes because finding the related parts involves searching for several different terms, e.g. copy-list -> string94 -> fn_copylist. Searching for copy-list wouldn’t find fn_copylist (though the initial proposal would help alleviate that by naming the strings better?)


#8

That would work too; the only reason I suggested giving the table entries variable names is that doing so makes getting rid of the builtins enum a bit easier (although I haven’t worked out the details yet).


#9

Actually (sorry for the double post) I looked at this again and there is an advantage to giving the table entries variable names: the minmax entry is also next to the function. That is kind of important when you’re doing stuff with the function.


#10

Yeah, I considered that aspect but I believe the longer code wasn’t worth it just for putting the minmax entry next to the function. I find the documentation string works well enough to show what the function does, so you don’t need to look at the minmax, and it’s rarely edited in my experience.


#11

@hasn0life do you agree with the first suggestion, to give the strings more useful names; for example:

const char string_defun[] = "defun";

instead of the current:

const char string19[] = "defun";

#12

Yes I think it would be helpful when searching for what functions are called. If you want you can also probably generate a comment which shows their index if that’s important, perhaps something like:
/*19*/ const char string_defun[] = "defun";