Great article as always!
On 10/6/23 00:05, Chris Wellons wrote:
> *A simple, arena-backed, generic dynamic array for C*> https://nullprogram.com/blog/2023/10/05/> > Copying over old data includes a special check for zero-length inputs, > because, quite frustratingly <https://nullprogram.com/blog/2023/02/11/>, > |memcpy| does not accept null even when the length is zero.
Do you know what the motivation for this is? I too have tripped over
this in the past and I cannot think of any useful compiler optimisation
or platform behaviour that is enabled by this assumption.
Re: A simple, arena-backed, generic dynamic array for C
Breaking the rules and replying to my own post as I learned some things
that may answer the question and be of interest to you/others.
On 10/10/23 10:01, Matthew Fernandez wrote:
> On 10/6/23 00:05, Chris Wellons wrote:>> *A simple, arena-backed, generic dynamic array for C*>> https://nullprogram.com/blog/2023/10/05/>> Copying over old data includes a special check for zero-length inputs, >> because, quite frustratingly >> <https://nullprogram.com/blog/2023/02/11/>, |memcpy| does not accept >> null even when the length is zero.> Do you know what the motivation for this is? I too have tripped over > this in the past and I cannot think of any useful compiler optimisation > or platform behaviour that is enabled by this assumption.
Today I tripped over `qsort(NULL, 0, …)` being UB and had the same
confusion. I went down the rabbit hole and surfaced:
[0] Paul Eggert eggert at cs.ucla.edu, Wed Oct 26 17:23:44 UTC 2022
> Although qsort(NULL, 0, ...) works fine on most practical platforms, as > Clive noted there is (or was) an oddball platform or two that is (or > was) verrrry picky about pointers, and for a program like zic where > portability is more important than performance, it's nice if zic runs > even on oddballs. Clive, do you happen to know what these platforms are > (or were), and whether they're still supported?
[1] Clive D.W. Feather clive at davros.org, Thu Oct 27 06:38:42 UTC 2022
> Sorry, but I have no idea. We didn't always get told - if someone at the> meeting said "I know an platform that does X", we believed them.> > There was a longstanding meme in the meetings (and on comp.std.c) that "if> you don't understand the reasoning behing something in the C Standard, the> answer is the IBM AS400".
[0]: https://mm.icann.org/pipermail/tz/2022-October/032107.html
[1]: https://mm.icann.org/pipermail/tz/2022-October/032115.html
Re: A simple, arena-backed, generic dynamic array for C
I don't know the history, but C and C++ generally regard null pointers as
super special — very much not a valid pointer to a zero-sized object. I
hadn't thought about it with qsort before, but other cases include fwrite
and fread. C++ takes it even further: When the standard says a function
accepts an "array" (e.g. std::ostream::write) you're not even allowed to
use zero as a count/nmemb with a non-null pointer because arrays cannot be
zero length. It's nuts, like they were trying to make gotchas.
Though whatever historical circumstance established this should not matter
today, GCC uses it as an optimization hint. If a pointer is passed to one
of these functions, then it's assumed not null, and that hint propagates
forward potentially removing null pointer checks and such.
A half workaround is to write your own copy/move/set functions with your
own names and sane semantics, then use those instead. However, you still
have to watch for issues like adding zero to a null pointer which, again,
_should_ be something you could do. It's natural and eliminates a bunch of
needless edge case checks, because they're not _really_ edge cases. But
it's used as an optimization hint, so you can't do it.
I wish these standard functions accepted null pointers when given zero
counts, and that adding/subtracting zero with a null pointer was defined
to yield a null pointer. It would simplify a great deal of code, eliminate
a ton of real world bugs, and probably cost virtually nothing in missed
optimizations. (I suspect most cases when a genuine slowdown is observed,
it's because the program had undefined null pointer use.) I'm fine with
leaving non-null offsets and counts on null pointers as undefined.
Re: A simple, arena-backed, generic dynamic array for C
On 10/12/23 13:15, Christopher Wellons wrote:
> Though whatever historical circumstance established this should not > matter today, GCC uses it as an optimization hint. If a pointer is > passed to one of these functions, then it's assumed not null, and that > hint propagates forward potentially removing null pointer checks and such.
In my prior reading, I came across a claim that the non-null requirement
allowed a pre-fetching optimisation¹ but I wrote this off as spurious
because I believe pre-fetching a non-null address at which there are 0
array elements can still trap on some platforms. Your theory makes much
more sense; thanks! IIUC you’re saying, though `qsort` may not directly
use the pointer when `nmemb` is 0, it may call another function that
performs non-null checks. With this assumption in the standard, the
callee can be inlined/specialised to remove the check. Nice!
¹ https://mm.icann.org/pipermail/tz/2022-October/032111.html> A half workaround is to write your own copy/move/set functions with your > own names and sane semantics, then use those instead.
Ah yes, I guess this in itself is a reasonable supporting argument too.
If the user really wants no UB here, the wrapper they need to implement
is no less efficient than the equivalent library implementation would be.
Re: A simple, arena-backed, generic dynamic array for C
Hi Matthew,
> In my prior reading, I came across a claim that the non-null> requirement allowed a pre-fetching optimisation
GCC's `__builtin_prefetch()` allows the argument to be NULL. And GCC has
been ported to a good amount of platforms. So I doubt that's a valid
reason.
> `qsort` may not directly use the pointer when `nmemb` is 0, it may> call another function that performs non-null checks.
This is less about "calling another function" and more about GCC's
`nonnull` attribute (or similar). For example:
extern void f(void *p) __attribute(( nonnull(1) ));
int isNull(void *p)
{
f(p);
return p == 0;
}
In the above snippet, the nonnull attribute in the function prototype
informs GCC that the argument 1 (p in this case) cannot be null. And
since the `isNull` function calls `f(p)`, GCC assumes that p is non-null
and will elide the `p == 0` check and always return 0:
https://godbolt.org/z/no35fP9fG
Also note that the compiler can assume that arguments to qsort/memcpy
etc are non-null *even if the libc doesn't mark it as such* since the
standard says it's undefined.
> If the user really wants no UB here, the wrapper they need to> implement is no less efficient than the equivalent library> implementation would be.
I have extremely high doubts that the speed benefit of removing a couple
null checks outweight the development friction created by them. AFAIK,
linux kernel outright disables this "optimization" by passing
`-fno-delete-null-pointer-checks` explicitly.
- NRK