~skeeto/public-inbox

RE: strcpy

Message-ID: <20220609111342.wbwwicrwway52wok@gen2.localdomain>

Hi Chris,

Recently came across your article on strcpy and the reddit discussion
around it. Unsurprisingly, usage of `mem*` functions on "strings"
remains controversial, despite these `mem*` functions being declared in
<string.h>.

About a week before finding your post, I was re-thinking my stance on
strlcpy[0] due to visiting this old thread[1] where Ulrich Drepper rejects
the proposal to add strlcpy to glibc.

After some thought, I came to a similar conclusion: strlcpy is a largely
useless function which can be replaced, depending on the situation
(i.e. whether truncation matters or not), by memccpy, strdup or memcpy.
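
For the case where truncation does not matter, a minimal sketch of the
memccpy replacement might look like the following. The helper name
`copy_trunc` is made up for illustration and does not come from the
article. Note that memccpy is POSIX (and C23), not C99.

```c
#include <stddef.h>
#include <string.h>

/* copy_trunc is a hypothetical helper, not an existing libc function.
 * Copies src into dst[0..cap) and always null-terminates when cap > 0.
 * Truncation is silently accepted, since the caller does not care. */
static void copy_trunc(char *dst, size_t cap, const char *src)
{
    if (cap == 0)
        return;
    /* memccpy returns NULL if no '\0' was found in the first cap bytes. */
    if (!memccpy(dst, src, '\0', cap))
        dst[cap - 1] = '\0';
}
```

When truncation does matter, strdup (or memcpy with an already known
length) avoids the fixed-size destination altogether.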

Also, regarding strscpy (since I saw this come up on the reddit thread):
ignoring the "string changing while copying" part and focusing only on
the API, I'm not a fan of it either.

If you get an `-E2BIG` return value, you don't know whether any copying
actually occurred or not, since it might simply return that due to `n`
being 0 or >INT_MAX [2].

I feel like any string copy function which tries to warn the user about
truncation is inherently broken. In cases where truncation matters, the
user simply shouldn't be using a fixed-size buffer to begin with.

Perhaps a function similar to getline, which would take a malloc-ed
pointer and realloc it as needed, would be more useful in the
"truncation-matters" scenarios.
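
A rough sketch of what such a getline-like copy could look like, under
the assumption that the caller passes either NULL or a malloc-ed pointer
together with its capacity. The name `copy_grow` and the
`(size_t)-1` failure value are invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/* copy_grow is a hypothetical function. As with getline, *dst must be
 * NULL or a malloc-ed pointer, and *cap its current capacity in bytes.
 * Returns the length of the copied string, or (size_t)-1 if allocation
 * fails, in which case *dst and *cap are left untouched. */
static size_t copy_grow(char **dst, size_t *cap, const char *src)
{
    size_t len = strlen(src);
    if (*dst == NULL || *cap < len + 1) {
        char *p = realloc(*dst, len + 1);
        if (!p)
            return (size_t)-1;
        *dst = p;
        *cap = len + 1;
    }
    memcpy(*dst, src, len + 1);  /* includes the terminator */
    return len;
}
```

The destination grows to fit the source, so there is no truncation to
report; the only failure left is running out of memory.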

[0]: https://nrk.neocities.org/articles/not-a-fan-of-strlcpy.html
[1]: https://sourceware.org/legacy-ml/libc-alpha/2000-08/msg00053.html
[2]: https://github.com/torvalds/linux/blob/6bfb56e93bcef41859c2d5ab234ffd80b691be35/lib/string.c#L181

- NRK

Re: strcpy

Message-ID: <20220609164940.e266ol5cmp42gq24@nullprogram.com>
In-Reply-To: <20220609111342.wbwwicrwway52wok@gen2.localdomain>

I enjoyed your article about strlcpy so much I've added a link to it from 
my own article, first paragraph.

More broadly, C programs should quickly get out of the null-terminated 
business and only convert back at interface boundaries requiring it. Think 
in terms of buffers rather than string objects, track lengths, allocate in 
batches/waves (Casey Muratori calls it "grouped element thinking"), and 
don't manage many individual string lifetimes (malloc, strdup). (Where 
"many" means some number related to the input size.) Then the matter of 
the str* functions is irrelevant before it comes up.

Note: Tracking lengths doesn't necessarily mean placing a size_t next to 
every char pointer. It may be part of a holistic program design and memory 
management strategy.
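
One possible shape for "buffers with tracked lengths" is sketched below. 
It is only an illustration: `struct buf`, `buf_append`, and `buf_cstr` 
are hypothetical names, and the caller is assumed to supply the backing 
storage from a single, larger allocation.

```c
#include <stddef.h>
#include <string.h>

struct buf {
    char  *data;  /* backing storage, not null-terminated while in use */
    size_t len;   /* bytes used */
    size_t cap;   /* bytes available */
};

/* Append n bytes. Returns 1 on success, or 0 (leaving the buffer
 * unchanged) if they do not fit. */
static int buf_append(struct buf *b, const void *src, size_t n)
{
    if (n > b->cap - b->len)
        return 0;
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 1;
}

/* Null-terminate only when an interface requires a C string.
 * Returns NULL if there is no room left for the terminator. */
static const char *buf_cstr(struct buf *b)
{
    if (b->len == b->cap)
        return NULL;
    b->data[b->len] = '\0';
    return b->data;
}
```

Since every length is known, all copies are plain memcpy calls and the 
str* functions never come into play until `buf_cstr` at the boundary.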

> I feel like any string copy function which tries to warn the user about 
> truncation is inherently broken

I completely agree, and this is a subject I've been meaning to write about 
for some time now. Parallel to your article: If truncation is permitted, 
you should have already been prepared for it, and if it's not permitted 
then the program is defective. Either way, reaching that point means an 
invariant has been violated and the program should abort (a la assert, 
segfault) before causing further damage.
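
As a sketch of that idea, a copy that treats "does not fit" as a 
violated invariant rather than a reportable error could look like this. 
The name `copy_or_abort` is hypothetical, and abort() is used here 
instead of assert so the check survives NDEBUG builds.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* copy_or_abort is a hypothetical helper. The caller promises that src
 * fits in dst[0..cap); if it does not, the program is defective and is
 * stopped immediately instead of continuing with truncated data. */
static void copy_or_abort(char *dst, size_t cap, const char *src)
{
    size_t len = strlen(src);
    if (len >= cap)
        abort();
    memcpy(dst, src, len + 1);  /* includes the terminator */
}
```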

A better-designed Annex K would abort rather than return errors if it 
detects its inputs are invalid, since the program is in an unexpected 
state. If it were expected, you wouldn't need the warning. Such 
sensitivity to invalid states is good since it catches defects more 
quickly and more reliably.