~sircmpwn/hare-rfc

11 7

[RFC v1] Replace strerror with write_error

Details
Message ID
<D3DK3ZHDTLDB.5A30WGFTKYV3@sebsite.pw>
DKIM signature
pass
Download raw message
                              RFC SUMMARY

Currently, by convention, Hare modules which provide an error type also
provide a strerror function, which takes in an error and returns a
string describing the error. This approach sounds sensible enough, but
it has significant limitations:

For one, if the potential length of an error message isn't bounded, the
message can't be safely stored in a static buffer. Thus, the only two
options are to truncate the string, or to heap-allocate the message.
Both of these are suboptimal, since most of the time, the error message
will be printed to an io::handle anyway (usually stderr). Heap
allocation also isn't compatible with nomem, unless the function aborts
when out of memory, which shouldn't be necessary.

Even beyond this issue, it's just an unnecessarily limiting interface.
If the error message isn't fixed (i.e. a string literal), it needs to be
stored in a static buffer before being returned to the user. As I'll
demonstrate, this middle step is completely unnecessary, and we can give
the user direct control over how and where the error message is
allocated (if it's allocated anywhere at all, which it often won't be).

As we try to improve the quality of error messages within Hare, the
strerror design will become more and more problematic, since it's
optimized for string literals.

                         ECOSYSTEM IMPLICATIONS

The "strerror" convention is dropped. Instead, the convention is to
supply a "write_error" function, with the following prototype:

	export fn write_error(h: io::handle, err: error) (size | io::error);

Generally, callers of the function will either use error assertion (when
the write can't fail) or discard the result (when the write could fail,
but it doesn't make sense to handle it).

                              EXAMPLE CODE

Currently, strerror is commonly used to print an error to stderr before
terminating the program, like so:

	fmt::fatal("Error:", strerror(err));

Here's what the equivalent logic looks like with write_error:

	fmt::error("Error: "): void;
	write_error(os::stderr, err): void;
	fmt::fatal(); // print newline + terminate

(In the future we'll likely replace void casts with assignments to `_`,
but the semantics are the same.)

If you actually want to store an error message string, you could use a
fixed buffer stream:

	static let msg: [256]u8 = [0...];
	let msg = memio::fixed(msg);
	write_error(&msg, err)!;
	let msg = memio::string(msg)!;

Here's an example strerror (status quo) implementation. Note the extra
logic needed in the error_b branch when the error message exceeds the
size of the buffer, even though the string will most likely be written
to another stream after the function returns:

	export fn strerror(err: error) str = {
		match (err) {
		case error_a =>
			return "Error A";
		case let err: error_b =>
			static let buf: [256]u8 = [0...];
			if (len(err.string) <= len(buf) - len("Error B")) {
				return fmt::bsprint("Error B", err.string);
			} else {
				return "Error B";
			};
		};
	};

Here's what a write_error implementation would look like for the same
type:

	export fn write_error(h: io::handle, err: error) (size | io::error) = {
		match (err) {
		case error_a =>
			return fmt::fprint(h, "Error A");
		case let err: error_b =>
			return fmt::fprint(h, "Error B", err.string);
		};
	};

For a real-world example of a place where a strerror function isn't
used at all because of the buffer size issue, see cmd/hare/main.ha. The
logic for terminating with an error is contained within main() itself,
rather than a strerror function for `error`:

	match (task(subcmd.0, subcmd.1)) {
	case void => void;
	case let e: exec::error =>
		fmt::fatal("Error:", exec::strerror(e));
	case let e: fs::error =>
		fmt::fatal("Error:", fs::strerror(e));
	// ...
	case let e: unknown_type =>
		fmt::fatalf("Error: Unknown build type: {}", e);
	case let e: output_exists =>
		fmt::fatalf("Error: Output path {} already exists, but isn't an executable file",
			e);
	case let e: output_failed =>
		fmt::fatalf("Error: Could not open output '{}': {}",
			e.0, fs::strerror(e.1));
	case let e: invalid_namespace =>
		fmt::fatalf("Error: Invalid namespace: {}", e);
	};

With write_error, this can be refactored to use a conventional
interface:

	match (task(subcmd.0, subcmd.1)) {
	case void => void;
	case let err: error =>
		fmt::error("Error: "): void;
		write_error(os::stderr, err): void;
		fmt::fatal();
	};

	// Outside of definition of main

	export fn write_error(h: io::handle, err: error) (size | io::error) = {
		match (err) {
		case let e: exec::error =>
			return exec::write_error(h, e);
		case let e: fs::error =>
			return fs::write_error(h, e);
		// ...
		case let e: unknown_type =>
			return fmt::fprintf(h, "Unknown build type: {}", e);
		case let e: output_exists =>
			return fmt::fprintf(h, "Output path {} already exists, but isn't an executable file",
				e);
		// ...
		};
	};

Note that the build driver is already going out of its way to not use
strerror, and instead print the error message directly. Modules which
aren't internal to a specific program don't have this option (with the
strerror convention), resulting in either an incorrect strerror
implementation (aborts if the message is too long), or a suboptimal one
(aborts on OOM, or silently truncates the message).

In fact, incorrect strerror implementations are *very* common, even in
the stdlib. Here's hare::lex::strerror:

	export fn strerror(err: error) const str = {
		static let buf: [2048]u8 = [0...];
		match (err) {
		case let err: io::error =>
			return io::strerror(err);
		case let s: syntax =>
			return fmt::bsprintf(buf, "{}:{}:{}: syntax error: {}",
				s.0.path, s.0.line, s.0.col, s.1);
		};
	};

Notice that the buffer size is smaller than path::MAX, so not all error
messages will fit into this buffer.

A while ago, time::chrono::strerror had a guaranteed memory leak, since
the returned error string was sometimes a statically allocated string
literal, and sometimes heap-allocated:

	export fn strerror(err: error) const str = {
		match (err) {
		case invalid =>
			return "Invalid moment";
		case invalidtzif =>
			return "Invalid TZif data";
		case let err: tzdberror =>
			match (err) {
			case let err: fs::error =>
				return fmt::asprintf(
					"Timezone database error: {}",
					fs::strerror(err));
			case let err: io::error =>
				return fmt::asprintf(
					"Timezone database error: {}",
					io::strerror(err));
				};
			// ...
			};
		// ...
		};
	};

strerror is also problematic in hare-c. Here's what c::check::strerror's
implementation looks like:

	export fn strerror(err: semantic) str = {
		static let buf: [1024]u8 = [0...];
		return fmt::bsprintf(buf, "{}:{}:{}: {}",
			err.loc.path, err.loc.line, err.loc.col, err.msg);
	};

This implementation is incorrect, because, in the case of static_assert,
the error message's length is unbounded. The C standard allows
implementations to limit the maximum length of string literals (with a
minimum limit of 4095 characters), but hare-c avoids enforcing hard
limits where possible. The same issue is present in c::parse::strerror,
when the #error directive is used.

By contrast, here's what c::check::write_error would look like:

	export fn write_error(h: io::handle, err: semantic) (size | io::error) = {
		return fmt::fprintf(h, "{}:{}:{}: {}",
			err.loc.path, err.loc.line, err.loc.col, err.msg);
	};

The implementation is nearly identical to strerror, except the message
is formatted to an io::handle rather than a static buffer. This solves
the aforementioned issue, since the caller can now decide how to handle
error messages (print directly, write to a fixed size buffer and
truncate if necessary, heap-allocate, etc.)

                     STANDARD LIBRARY IMPLICATIONS

The changes in convention described above are applied to all stdlib
modules which currently supply a strerror function.

Two exceptions need to be made: the rt:: and errors:: modules can't
provide a write_error function, because they're depended on by io::. For
the time being, they can stick with strerror (though we should rename
errors::strerror to errors::string, since there's no longer a convention
that needs to be conformed to). This isn't too much of an issue, since
the documentation for errors::strerror already advises against its use,
encouraging modules to supply their own more informative error messages.
Changing the convention to write_error will make modules more likely to
actually follow this advice. As for rt::strerror, it's currently only
used in one place: errors::errno. And since rt::errno is almost always
converted to errors::error before being returned by a higher-level
module, this is a non-issue for rt:: as well.

Certain stdlib modules, like unix::tty::, will need to change
their interfaces to return a module-specific error type rather than
errors::error (even if this RFC is rejected, we should still do this).
Details
Message ID
<2b261ef9-0ca5-48f1-be94-21ccde961f6c@strohwolke.at>
In-Reply-To
<D3DK3ZHDTLDB.5A30WGFTKYV3@sebsite.pw> (view parent)
DKIM signature
pass
Download raw message
I really like this approach. Good Idea!

On 8/12/24 3:53 AM, Sebastian wrote:
> 	fmt::error("Error: "): void;
> 	write_error(os::stderr, err): void;
> 	fmt::fatal(); // print newline + terminate

Can fmt be extended to allow such an error function to be passed as 
formattable argument so that the following would be possible?

  	fmt::fatal("Error:", &write_error);
	fmt::fatalf("Error: {}", &write_error);
Details
Message ID
<D3DSVKWPG2RU.1U5NXQZR90UFH@torresjrjr.com>
In-Reply-To
<2b261ef9-0ca5-48f1-be94-21ccde961f6c@strohwolke.at> (view parent)
DKIM signature
pass
Download raw message
+1 to Sebastian's proposal, regardless of this fmt suggestion.

On Mon Aug 12, 2024 at 7:26 AM BST, Armin Preiml wrote:
> I really like this approach. Good Idea!
>
> On 8/12/24 3:53 AM, Sebastian wrote:
> > 	fmt::error("Error: "): void;
> > 	write_error(os::stderr, err): void;
> > 	fmt::fatal(); // print newline + terminate
>
> Can fmt be extended to allow such an error function to be passed as 
> formattable argument so that the following would be possible?
>
>   	fmt::fatal("Error:", &write_error);
> 	fmt::fatalf("Error: {}", &write_error);

On first thought, yes.

type field = (...formattable | *mods);

type formattable = (...types::numeric | uintptr | str | rune | bool | nullable *opaque |
        void);

We could extend fmt::field to include a theoretical `*errors::writer`.
However, that could conflict with fmt::formattable's `nullable *opaque`
member. But then, we already have fmt::field containing `*mods`.
Details
Message ID
<D3DV9PJQ9MI1.3JLJK9345HAHE@cmpwn.com>
In-Reply-To
<D3DK3ZHDTLDB.5A30WGFTKYV3@sebsite.pw> (view parent)
DKIM signature
pass
Download raw message
Pretty strong NACK.

My main rationale for NACK'ing this is citing worse is better. strerror
is not as comprehensive for all error scenarios, but it is *damn*
convenient for the vast majority of error scenarios.

Most error handling scenarios are simple and do not suffer especially
much from the strerror convention. Something like a compiler (e.g.
hare-c) has much more complex error handling and reporting requirements,
perhaps by an order of magnitude, than the standard case. As the wisdom
goes, a convention should handle 95% of cases and the remaining 5%
should defy the convention.

I would sooner prefer that some modules or use-cases for error handling
that exceed the bounds of strerror should establish a parallel
convention for stringifying errors. One example of how this would look
is that something like hare-c just handles errors in a substantially
different manner than the convention affords. For instance, it might be
nice to give harec-style contextual errors like so:

Error: assignment to 'int' from 'char *' makes integer from pointer without a cast
3 |     int x = "Hello world";
	      ^

I think it's well beyond the remit of strerror (or even the proposed
write_error) to be doing things like opening files to fetch context
strings. Nevertheless, it's well within the scope of a tool like hare-c
to report errors in this manner.

As an alternative approach, I would suggest retaining the strerror
convention, but taking it as a given constraint that strerror cannot
always return the best errors. It might use a static buffer (never heap
allocating) and truncate the error, or might simply return a constant
string that lacks context. We can establish *alongside* strerror a
conventional function like write_error which can represent errors with
more detail -- supplementing strerror only when required but retaining
strerror for the simple approach which handles 95% of cases. (I might
also suggest wrerror as a better name than write_error if this is
established as an ecosystem-wide convention for stronger error
handling).

Also, as Seb notes, adding a dependency on io for essentially all
modules is not something I'm totally comfortable with.
Lorenz (xha) <me@xha.li>
Details
Message ID
<Zr2y9klb8BZoDX4k@xha.li>
In-Reply-To
<D3DK3ZHDTLDB.5A30WGFTKYV3@sebsite.pw> (view parent)
DKIM signature
pass
Download raw message
i don't have much to add here, i mostly agree with what that what
drew said except that i don't see it as problematic as he does. i
really like your idea but also see the downside of depending on
io:: everywhere, and that strerror is quite convinient.

i think that the main problem with errors is not really the formatting,
but the context in which the error occoured. the hare::module
hackery with heap-allocated error messages is a good example.

in some cases, we might be able to use

type someerror = [path::MAX]u8;

however, this is quite problematic, especially on linux where
PATH_MAX is 4096.

another thought that i've had is that we might want to somehow
attach the line and file (number) to types with error flags themselves.

i don't know what the performance impact of having a tagged union
with one very large (128 bytes or something) error type is. we could
somehow "standardize" a length for error messages, and everything
above that is probably not useful for the user anyways.

maby we could even have some special space, somewhere at the beginning
of the stack or in the binary, reserved for these kind of error
types. that could maby improve the performance because that would
mean function local variables are not that far apart which in turn
would mean that cache misses would not be a problem with these large
[path::MAX]u8 error types anymore. but that'd have to be a builtin
into the language i guess, i am not sure if that'd be worth it.

maby allowing to attach line and file number to error types when
needed is the best option so it's at least possible to know where
the error occoured.
Details
Message ID
<D3GC9PNZCSO7.25UNC63KK0GXL@cmpwn.com>
In-Reply-To
<Zr2y9klb8BZoDX4k@xha.li> (view parent)
DKIM signature
pass
Download raw message
Is it better to conventionally keep error types short/small, with
minimal context, and if we need to gather additional context like paths
we establish a separate error reporting mechanism, like some kind of
"error collector" passed around modules that report them? Not
standardized or implemented in the stdlib, just conventional.
Lorenz (xha) <me@xha.li>
Details
Message ID
<Zr3mTB-4M2I-uvdU@xha.li>
In-Reply-To
<D3GC9PNZCSO7.25UNC63KK0GXL@cmpwn.com> (view parent)
DKIM signature
pass
Download raw message
On Thu, Aug 15, 2024 at 10:23:19AM +0200, Drew DeVault wrote:
> Is it better to conventionally keep error types short/small, with
> minimal context, and if we need to gather additional context like paths
> we establish a separate error reporting mechanism, like some kind of
> "error collector" passed around modules that report them? Not
> standardized or implemented in the stdlib, just conventional.

yeah, i think this is a good idea. but i broke my head when trying
to think about how to implement this.
Lorenz (xha) <me@xha.li>
Details
Message ID
<Zr3nsYvxRJtWd0NW@xha.li>
In-Reply-To
<Zr3mTB-4M2I-uvdU@xha.li> (view parent)
DKIM signature
pass
Download raw message
On Thu, Aug 15, 2024 at 01:28:12PM +0200, Lorenz (xha) wrote:
> On Thu, Aug 15, 2024 at 10:23:19AM +0200, Drew DeVault wrote:
> > Is it better to conventionally keep error types short/small, with
> > minimal context, and if we need to gather additional context like paths
> > we establish a separate error reporting mechanism, like some kind of
> > "error collector" passed around modules that report them? Not
> > standardized or implemented in the stdlib, just conventional.
> 
> yeah, i think this is a good idea. but i broke my head when trying
> to think about how to implement this.

so my problem as of right now is: i have a function, and that
function does multiple things, for example, open a bunch of
files.

it fails to open one fail with eperm and returns that error.

EPERM as a return value is absolutly not helpful because you don't
know on which file it failed. i'd wish it was somehow possible to
identify on which file it fails on, ideally keeping the "?" postfix
expression thing in place and not using match.

even if we where using match, it's kind of hard to return which
file it failed on. we could use an index to the file in the array,
but that is also kind of not nice for the caller, if it passed
those using varaadic parameters or something.

i have no idea if we can fix this to be honest.
Details
Message ID
<D3GGWAXVK5DB.1TLK8Y0TKDVS0@cmpwn.com>
In-Reply-To
<Zr3nsYvxRJtWd0NW@xha.li> (view parent)
DKIM signature
pass
Download raw message
We could have an error collector which heap allocates them (maybe in an
arena) and can be passed to functions which produce more complex errors.
Then they can record a detailed error with the collector and return a
simple error up the stack. Maybe the simple errors store a handle which
acts as a reference to the detailed error?

All of this still being domain-specific and implemented in the
application in question as a pattern rather than being something I wish
to see in the stdlib etc.
Details
Message ID
<f7d0bb4d-019f-4189-82eb-4f50f370c137@spxtr.net>
In-Reply-To
<D3DV9PJQ9MI1.3JLJK9345HAHE@cmpwn.com> (view parent)
DKIM signature
pass
Download raw message
I agree with the basic idea that strerror is limiting and not the right
choice for some (most?) modules. I also think that the stdlib's heavy
use of static buffers is a bit of a smell.


> Currently, strerror is commonly used to print an error to stderr
> before terminating the program, like so:
>
> 	fmt::fatal("Error:", strerror(err));
>
> Here's what the equivalent logic looks like with write_error:
>
> 	fmt::error("Error: "): void;
> 	write_error(os::stderr, err): void;
> 	fmt::fatal(); // print newline + terminate

This makes the situation significantly worse for the 95% of cases where
the error is a simple string literal. The strerror case is simple and
clear, but the write_error case is not. It has three different function
calls that each write to stderr!


> Also, as Seb notes, adding a dependency on io for essentially all
> modules is not something I'm totally comfortable with.

I get that reducing dependencies is good in general, but io:: is a
particularly lightweight dependency that is already used by basically
all Hare programs. It might be possible to make it even lighter by
splitting the util functions into a separate module from the io::handle
definition.


I would expect it's possible to come up with some better system that is
both flexible enough to handle complicated error messages but still
simple to use for simple errors.


Also, note that this feels very similar to some changes I tried to make
a while ago to strconv:

https://lists.sr.ht/~sircmpwn/hare-dev/%3C20240226202915.2552366-1-me@spxtr.net%3E

I would be particularly interested in some solution that works for both
errors and strconv.
Details
Message ID
<a6e9be66-8e6c-4990-ace2-fd93303e5255@app.fastmail.com>
In-Reply-To
<D3GGWAXVK5DB.1TLK8Y0TKDVS0@cmpwn.com> (view parent)
DKIM signature
pass
Download raw message
I agree with some of the nuances originally described here, but I also
don't like the complexity of the solution proposed.

A major annoyance right now is that `strerror` sometimes returns a
statically allocated string, and sometimes returns a heap allocated. It
is impossible for the caller to determine whether they must `free` the
returned string or not.

This will become an even more annoying problem when we have linear
types, given that the return value's "lifetime" or "origin" is not
static.

-- 
Hugo
Details
Message ID
<D40F2U84XI35.2V69TOTOKVJB2@sebsite.pw>
In-Reply-To
<D3DV9PJQ9MI1.3JLJK9345HAHE@cmpwn.com> (view parent)
DKIM signature
pass
Download raw message
On Mon Aug 12, 2024 at 6:38 AM EDT, Drew DeVault wrote:
> Most error handling scenarios are simple and do not suffer especially
> much from the strerror convention. Something like a compiler (e.g.
> hare-c) has much more complex error handling and reporting requirements,
> perhaps by an order of magnitude, than the standard case. As the wisdom
> goes, a convention should handle 95% of cases and the remaining 5%
> should defy the convention.
>
> I would sooner prefer that some modules or use-cases for error handling
> that exceed the bounds of strerror should establish a parallel
> convention for stringifying errors. One example of how this would look
> is that something like hare-c just handles errors in a substantially
> different manner than the convention affords. For instance, it might be
> nice to give harec-style contextual errors like so:
>
> Error: assignment to 'int' from 'char *' makes integer from pointer without a cast
> 3 |     int x = "Hello world";
> 	      ^
>
> I think it's well beyond the remit of strerror (or even the proposed
> write_error) to be doing things like opening files to fetch context
> strings. Nevertheless, it's well within the scope of a tool like hare-c
> to report errors in this manner.

Hm yeah this is a good point.

> As an alternative approach, I would suggest retaining the strerror
> convention, but taking it as a given constraint that strerror cannot
> always return the best errors. It might use a static buffer (never heap
> allocating) and truncate the error, or might simply return a constant
> string that lacks context. We can establish *alongside* strerror a
> conventional function like write_error which can represent errors with
> more detail -- supplementing strerror only when required but retaining
> strerror for the simple approach which handles 95% of cases. (I might
> also suggest wrerror as a better name than write_error if this is
> established as an ecosystem-wide convention for stronger error
> handling).

Eh, I don't think it makes much sense to have two ways to obtain error
messages, where one method is just, better than the other. If it's
possible to obtain an error message which doesn't truncate, why would
you choose the truncating method?

An ecosystem-wide convention for something like this might make sense if
it's established that modules will provide *either* strerror or
write_error, preferring strerror if there isn't a good reason to choose
something else, but like you said, some modules and use-cases have
special error handling requirements, so an ecosystem-wide convention
doesn't really make much sense in that case.

> Also, as Seb notes, adding a dependency on io for essentially all
> modules is not something I'm totally comfortable with.

Most modules already have io:: somewhere in their dependency chain, and
if they don't, most programs will have an io:: dependency anyway, so I
don't see this as much of an issue.
Reply to thread Export thread (mbox)