Interesting, thanks! So the foundations are there 30 years ago. That goes
along with virtual memory and plenty of RAM becoming common, after the era
of cramped "microcomputers." Though perhaps memory wasn't _so_ plentiful
since it's using a chained memory blocks design.
Last month I saw an early (1988) use of the word "arena" with a similar
design, though it doesn't mention, as you highlighted in that book, how it
simplifies the code:
Fast Allocation and Deallocation of Memory Based on Object Lifetimes
https://www.cs.princeton.edu/techreports/1988/191.pdf> In the very next page it even includes a `NEW` macro
Fascinating how similar it is. It also avoids repeating the type, though
it can't be used in initialization — which, of course, wasn't happening
much in the C89 days anyway.
> Probably also illustrates just how much influence over the programmer > the defaults of a language has - very few look outside that box.
Great observation! I suspect educational materials play a role, too. They
all follow the same, conventional route, teaching a tradition rather than
effective practice derived from experience. Almost none mention debuggers,
and none ever mention sanitizers. The famous CS50 course has a Makefile
with "-Wall -Wextra", which puts it above average, but along with omitting
sanitizers, they don't even enable debugging information.
The "untangling lifetimes" article touches on some of this.
> I was wondering if you had any other highly practical resources on > compilers (particularly modern optimizing compilers) like this.
Sorry, I don't. I'm definitely going to check out "A Retargetable C
Compiler" though!
> After a bit of inspection, I'm currently experimenting with the > following two macros instead
"Zero single items and don't initialize arrays" fits my own observations,
too. I've tried splitting the difference, keeping a single interface which
always initializes the first element. Feels a little too magical, though.
My current thinking is that rarely can I measure performance cost of zero
initialization. It's so cheap. I might as well simplify my program by
eliminating init/noinit as a concern until it makes a real difference.
> Fast Allocation and Deallocation of Memory Based on Object Lifetimes> https://www.cs.princeton.edu/techreports/1988/191.pdf
I noticed that paper pop up in the C subreddit but I hadn't looked at
it. Skimming through it now, it looks like a carbon copy of the scheme
used in `lcc`! At the end in the acknowledgement section I noticed:
| Chris Fraser's comments helped clarify several sections of this paper.
And then scrolling back up to the top I see the author of the paper is
David R. Hanson. If you now take a look at the authors of "Retargetable
C Compiler" it will become obvious that this wasn't an accident :)
> It also avoids repeating the type, though it can't be used in> initialization
At first I thought so too, but as ridiculous it may look, the following
works:
char *p = (p = malloc(4));
Since AFAIK there's a sequence point after an assignment statement, it
should be well defined behavior too. (If not, there's also comma
operator which can force a sequence point, although that would require
evaluating `p` twice).
I might experience around a bit with this macro. Even though I generally
avoid doing nasty things like control-flow or assignments inside of
macros.
- NRK
- - -
(My original mail got rejected by the mailing list due to the png
attachment. Copy pasting it below for archival purposes.)
- - -
I was recently skimming through the book "A Retargetable C
Compiler" and saw a couple interesting things that might
interest you. Instead of beginning with lexer like most other
compiler books, it starts off by introducing the memory
management scheme: different arenas for different lifetimes.
An excerpt from chapter 2 (emphasis added):
| Most implementations of malloc use memory-management algorithms
| that are necessarily based on the sizes of objects. Algorithms
| based on object lifetimes are more efficient - if all of the
| deallocations can be done at once. Indeed, stacklike allocation
| would be most efficient, but it can be used only if object
| lifetimes are nested, which is generally not the case in
| compilers and many other applications.
|
| This chapter describes lcc's storage management scheme, which is
| based on object lifetimes. In this scheme, allocation is more
| efficient than malloc, and the cost of deallocation is
| negligible.
| But the real benefit is that this scheme simplifies the code.
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| Allocation is so cheap that it encourages simple
| applicative algorithms in place of more space-efficient but
| complex ones. And allocation incurs no deallocation obligation,
| so deallocation can't be forgotten.
In the very next page it even includes a `NEW` macro, very
similar to the one you use (except it takes the pointer being
assigned as argument instead of type):
#define NEW(p,a) ((p) = allocate(sizeof *(p), (a)))
#define NEW0(p,a) memset(NEW((p),(a)), 0, sizeof *(p))
Interesting to see how better memory-management techniques have
been known for decades but never really "trickled-down" to the
masses. Probably also illustrates just how much influence over
the programmer the defaults of a language has - very few look
outside that box.
Another thing I find very interesting about this book is that -
unlike most other compiler books which are either academic (all
theory) or builds something toy and not suitable for real-world
usage - this one builds an actual production grade compiler. I
was wondering if you had any other highly practical resources on
compilers (particularly modern optimizing compilers) like this.
P.S: I was also recently experimenting with zero-initializing
allocations by default but noticed a good ton of cases where I was
manually opt-ing out of it. After a bit of inspection, I'm
currently experimenting with the following two macros instead:
arena_item(Arena *, T) // allocates a single zero-initialized object of type T
arena_array(Arena *, T, Size cnt) // allocates uninitialized array of type T with `cnt` elements
This may seem a bit inconsistent but it's actually consistent
with my usage code where zero-initializing single object
allocation is either preferred or harmless but zero-initializing
array is almost never wanted/required.
> it will become obvious that this wasn't an accident :)
Ha, so these aren't so independent as I had thought!
> as ridiculous it may look, the following works
Ah, nice workaround. Your reasoning about its correctness sounds right to
me. Though while it doesn't repeat the type, it repeats the variable name.
> My original mail got rejected by the mailing list due to the png > attachment.
I was trying to figure that out myself before I came to that realization.
The recent major sourcehut outage was long enough to lose list messages. I
was able to bounce some back into the list, where they now appear out of
order, but at least they're there. I thought maybe this was related, so I
tried to bounce yours, which were accepted by sourcehut SMTP but never
appeared. Sorry if you got some backscatter from that!
Perhaps it would be nicer if sourcehut SMTP rejected messages with
disallowed attachments if only because I'd get feedback when bouncing. On
the other hand, could it explain the rejection well? So that might require
an asynchronous rejection anyway. It would also be nicer if, on the
website, messages were ordered by their date header, not by their arrival
order at sourcehut. (That could be abused, but there are tools to deal
with abuse.)
> I might experiment around a bit with this macro.
Small update, I've tried this out on some scratch projects. Overall, I
don't like it. 2 major issues:
* It's difficult (for me, at least) to read code where assignments are
being hidden behind macros.
* Doesn't quite work (without casting) when the pointer you want to
assign to is a void pointer.
> char *p = (p = malloc(4));>> Since AFAIK there's a sequence point after an assignment statement
Also a small correction, assignments don't have sequence point,
*initializer* does. And so..
char *p = (p = malloc(4));
..is correct. But
char *p;
p = (p = malloc(4));
is not, since it's not an initializer.
- NRK
Thanks for the followup!
> It's difficult (for me, at least) to read code where assignments are > being hidden behind macros.
Same for me, so I'd probably dislike it for the same reason. Though at
least it's not a worse case: when variable declarations are hidden inside
macros.