~sircmpwn/hare-dev

This thread contains a patchset. You're looking at the original emails, but you may wish to use the patch review UI. Review patch
9 5

[PATCH harec] type_store: allow unbounded arrays with undefined member size

Details
Message ID
<20230917102229.18447-1-autumnull@posteo.net>
DKIM signature
missing
Download raw message
Patch: +9 -9
Signed-off-by: Autumn! <autumnull@posteo.net>
---
 src/type_store.c   | 12 ++++++------
 tests/01-arrays.ha |  6 +++---
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/type_store.c b/src/type_store.c
index bbc723c..fa97802 100644
--- a/src/type_store.c
+++ b/src/type_store.c
@@ -762,17 +762,17 @@ type_init_from_atype(struct type_store *store,
			*type = builtin_type_error;
			return (struct dimensions){0};
		}
		if (memb.size == SIZE_UNDEFINED) {
			error(store->check_context, atype->loc, NULL,
				"Type of undefined size is not a valid array member");
			*type = builtin_type_error;
			return (struct dimensions){0};
		}

		type->align = memb.align;
		if (type->array.length == SIZE_UNDEFINED) {
			type->size = SIZE_UNDEFINED;
		} else {
			if (memb.size == SIZE_UNDEFINED) {
				error(store->check_context, atype->loc, NULL,
					"Type of undefined size is not a valid array member");
				*type = builtin_type_error;
				return (struct dimensions){0};
			}
			type->size = memb.size * type->array.length;
		}
		break;
diff --git a/tests/01-arrays.ha b/tests/01-arrays.ha
index fa8769a..937b6c3 100644
--- a/tests/01-arrays.ha
+++ b/tests/01-arrays.ha
@@ -167,10 +167,10 @@ fn eval_access() void = {
};

fn reject() void = {
	// unbounded arrays of values of undefined size
	assert(compile("fn f() void = { let x = null: *[*][*]int; };")
	// indexing arrays of values of undefined size
	assert(compile("fn f() void = { let x = null: *[*][*]int; x[0]};")
		as exited == EXIT_FAILURE);
	assert(compile("fn f() void = { let x = null: *[*]fn ()int; };")
	assert(compile("fn f() void = { let x = null: *[*]fn ()int; x[0]};")
		as exited == EXIT_FAILURE);

	// assignment to array of undefined size
-- 
2.42.0

[harec/patches] build success

builds.sr.ht <builds@sr.ht>
Details
Message ID
<CVL4AMH79BUV.1G6T38VQ0UTWT@cirno2>
In-Reply-To
<20230917102229.18447-1-autumnull@posteo.net> (view parent)
DKIM signature
missing
Download raw message
harec/patches: SUCCESS in 1m15s

[type_store: allow unbounded arrays with undefined member size][0] from [Autumn!][1]

[0]: https://lists.sr.ht/~sircmpwn/hare-dev/patches/44805
[1]: autumnull@posteo.net

✓ #1058786 SUCCESS harec/patches/netbsd.yml  https://builds.sr.ht/~sircmpwn/job/1058786
✓ #1058784 SUCCESS harec/patches/alpine.yml  https://builds.sr.ht/~sircmpwn/job/1058784
✓ #1058785 SUCCESS harec/patches/freebsd.yml https://builds.sr.ht/~sircmpwn/job/1058785
Details
Message ID
<CVMBPQTMKNM8.1YSK94RS5N0R6@notmylaptop>
In-Reply-To
<20230917102229.18447-1-autumnull@posteo.net> (view parent)
DKIM signature
missing
Download raw message
Hm, I'm not sure this is a good idea.

The main use-case for slice members with undefined size is []opaque,
since []opaque has the unique property that all slice types are
assignable to it. *[*]opaque doesn't have this property, and thus
there's not much reason to use it. You can't index from it, so you'd
need to cast it into a pointer to some other array type anyway,
something which is already possible with *opaque.

I guess another way of putting it is this: it doesn't make sense to have
an array of an opaque type, since by definition the opaque type is,
well, opaque. This is true whether or not the array is bounded: an
unbounded array is still an array, just without a known length/size, but
it still doesn't make sense to store opaque data within it. [*]opaque
is, itself, a type whose representation is undefined since the data is
opaque. But that makes [*]opaque indistinguishable from opaque itself.
Case in point: bounded arrays whose member has undefined size don't make
sense, and aren't allowed by this patch. But I argue that the same
reasoning for disallowing bounded arrays here also applies to unbounded
arrays.

Opaque slices make more sense, since internally a slice just stores a
pointer, and that pointer is to opaque data. But the opaque data isn't
stored within the representation of the slice itself.

This is to say that [*]opaque doesn't fill any particular niche that
isn't already filled by opaque itself, so I think it's simpler to just
disallow it entirely.
Details
Message ID
<CVN5MMFE9K4O.1RB5L1FOGB24F@attila>
In-Reply-To
<CVMBPQTMKNM8.1YSK94RS5N0R6@notmylaptop> (view parent)
DKIM signature
missing
Download raw message
One thing that I find useful about *[*]opaque is the ability to get an
opaque slice from it via regular unbounded array slicing:

fn to_slice(data: *[*]opaque, ndata: size) = {
	return data[..ndata];
};

Equivalent code without this patch involves quite a bit of casting gymnastics.

Additionally, the distinction between an opaque unbounded array and an opaque
pointer carries semantic information that may be very useful to the programmer.
Details
Message ID
<CVN85LI7WRKZ.3OK4STVOR0BOK@notmylaptop>
In-Reply-To
<CVN5MMFE9K4O.1RB5L1FOGB24F@attila> (view parent)
DKIM signature
missing
Download raw message
On Tue Sep 19, 2023 at 3:51 PM EDT, Bor Grošelj Simić wrote:
> One thing that I find useful about *[*]opaque is the ability to get an
> opaque slice from it via regular unbounded array slicing:
>
> fn to_slice(data: *[*]opaque, ndata: size) = {
> 	return data[..ndata];
> };
>
> Equivalent code without this patch involves quite a bit of casting gymnastics.

I'm still not convinced tbh. It's only one additional cast, which is
hardly "casting gymnastics".

I'd also be interested in seeing real-world examples of places where
*[*]opaque would be useful. The one example I've seen is hare-raylib,
where pointers are sometimes converted to arrays (correct me if I'm
wrong here). But I'm not convinced this is correct anyways (I say this
not having used hare-raylib, so take this with a grain of salt), nor do
I think *opaque is problematic here. Any other examples you know of?

> Additionally, the distinction between an opaque unbounded array and an opaque
> pointer carries semantic information that may be very useful to the programmer.

Eh, I guess? I'm still not convinced: "array of opaque data" still
doesn't make sense, and if the data being pointed to is opaque then the
array doesn't convey much additional information that can't be conveyed
through context or documentation. I do see the argument here though.
This also would be a more convincing argument if there were other
real-world examples where the distinction is helpful (besides
hare-raylib, which I'm not dismissing or anything, I'd just like to see
other examples).
Details
Message ID
<CVNAXHJMYGVZ.2Z4ZFZV6PS8I3@attila>
In-Reply-To
<CVN85LI7WRKZ.3OK4STVOR0BOK@notmylaptop> (view parent)
DKIM signature
missing
Download raw message
On Tue Sep 19, 2023 at 11:50 PM CEST, Sebastian wrote:
> On Tue Sep 19, 2023 at 3:51 PM EDT, Bor Grošelj Simić wrote:
> > One thing that I find useful about *[*]opaque is the ability to get an
> > opaque slice from it via regular unbounded array slicing:
> >
> > fn to_slice(data: *[*]opaque, ndata: size) = {
> > 	return data[..ndata];
> > };
> >
> > Equivalent code without this patch involves quite a bit of casting gymnastics.
>
> I'm still not convinced tbh. It's only one additional cast, which is
> hardly "casting gymnastics".

Err, maybe I missed the optimal way to do that? What do you suggest using?

> I'd also be interested in seeing real-world examples of places where
> *[*]opaque would be useful. The one example I've seen is hare-raylib,
> where pointers are sometimes converted to arrays (correct me if I'm
> wrong here). But I'm not convinced this is correct anyways (I say this
> not having used hare-raylib, so take this with a grain of salt), nor do
> I think *opaque is problematic here. Any other examples you know of?

I had qsort and related functions in mind:

fn qsort(base: *[*]opaque, nmemb: size, sz: size, compar: *fn(...)) void;

I suppose interfacing directly with libc is not very common thing in hare, but
by Greenspun's tenth rule there are a lot of various somewhat generic data
structures out there, many of them using void * for things that are obviously
arrays.

>
> > Additionally, the distinction between an opaque unbounded array and an opaque
> > pointer carries semantic information that may be very useful to the programmer.
>
> Eh, I guess? I'm still not convinced: "array of opaque data" still
> doesn't make sense, and if the data being pointed to is opaque then the
> array doesn't convey much additional information that can't be conveyed
> through context or documentation. I do see the argument here though.
> This also would be a more convincing argument if there were other
> real-world examples where the distinction is helpful (besides
> hare-raylib, which I'm not dismissing or anything, I'd just like to see
> other examples).

I think the main reason why I see this as potentially useful is that low level
bindings for C code are often not going to have their own documentation, so
seeing *[*]opaque is comparatively more valuable because the alternative is
opening the docs of the original C project (which the author of the bindings
has no control over).
Details
Message ID
<CVNB9JCNXNGT.242U6LV8FNE4O@notmylaptop>
In-Reply-To
<CVNAXHJMYGVZ.2Z4ZFZV6PS8I3@attila> (view parent)
DKIM signature
missing
Download raw message
On Tue Sep 19, 2023 at 8:01 PM EDT, Bor Grošelj Simić wrote:
> On Tue Sep 19, 2023 at 11:50 PM CEST, Sebastian wrote:
> > On Tue Sep 19, 2023 at 3:51 PM EDT, Bor Grošelj Simić wrote:
> > > One thing that I find useful about *[*]opaque is the ability to get an
> > > opaque slice from it via regular unbounded array slicing:
> > >
> > > fn to_slice(data: *[*]opaque, ndata: size) = {
> > > 	return data[..ndata];
> > > };
> > >
> > > Equivalent code without this patch involves quite a bit of casting gymnastics.
> >
> > I'm still not convinced tbh. It's only one additional cast, which is
> > hardly "casting gymnastics".
>
> Err, maybe I missed the optimal way to do that? What do you suggest using?

fn to_slice(data: *opaque, ndata: size) []opaque = {
	// []u8 is assignable to []opaque
	return (data: *[*]u8)[..ndata];
};

Or, more realistically, if you know the actual length of the array, you
most likely know the actual type as well (or at least the type you want
to use to interpret the data):

let data = (data: *[*]u32)[..ndata]; // []u32

> I had qsort and related functions in mind:
>
> fn qsort(base: *[*]opaque, nmemb: size, sz: size, compar: *fn(...)) void;
>
> I suppose interfacing directly with libc is not very common thing in hare, but
> by Greenspun's tenth rule there are a lot of various somewhat generic data
> structures out there, many of them using void * for things that are obviously
> arrays.

Makes sense. I still think it's fine to use *opaque here (at least in
the specific example you gave), but I see the value [*]opaque would
provide.

> I think the main reason why I see this as potentially useful is that low level
> bindings for C code are often not going to have their own documentation, so
> seeing *[*]opaque is comparatively more valuable because the alternative is
> opening the docs of the original C project (which the author of the bindings
> has no control over).

I'd argue that if the bindings don't have docs, you should be using the
original docs anyway.

But that said, I do see where [*]opaque would be used within C bindings.
I still don't think that's enough of a justification for special casing
unbounded arrays like this. [*]opaque wouldn't add any new
functionality, which is fine if there's a good reason for it to exist,
but I don't think C interop is on its own a compelling enough reason. I
won't die on this hill though.
Details
Message ID
<9a623e05-2b0e-4f02-a92a-2f2667cbab4a@posteo.net>
In-Reply-To
<CVNB9JCNXNGT.242U6LV8FNE4O@notmylaptop> (view parent)
DKIM signature
missing
Download raw message
so, my main reasons for this were:

marking up C-style array pointers with more useful information, as bgs 
mentioned
and then secondly it just seems unnecessary to forbid it. this patch 
just makes it so that the restriction of "you can't instantiate a type 
of undefined size" is only applied where necessary. in fact honestly 
e.g. *[4]opaque could be allowed too, since it's not doing any harm as a 
pointer, and that would remove the special case for just bounded array 
pointers. i hadn't really considered that part when i sent this patch.

the nice thing about pointers is that they're just an address to the 
start of an item, so you don't have to care about the size of the item, 
and the type checking rules for pointers can ignore undefined size -- 
this is the premise of *opaque. i think it makes sense to generalize 
this rule to pointers to items of undefined size, rather than 
special-casing opaque.

~Autumn
Details
Message ID
<CVQIVQPZLKBC.22PIU7M2JPRB0@notmylaptop>
In-Reply-To
<9a623e05-2b0e-4f02-a92a-2f2667cbab4a@posteo.net> (view parent)
DKIM signature
missing
Download raw message
On Sat Sep 23, 2023 at 10:37 AM EDT, Autumn! wrote:
> and then secondly it just seems unnecessary to forbid it. this patch 
> just makes it so that the restriction of "you can't instantiate a type 
> of undefined size" is only applied where necessary. in fact honestly 
> e.g. *[4]opaque could be allowed too, since it's not doing any harm as a 
> pointer, and that would remove the special case for just bounded array 
> pointers. i hadn't really considered that part when i sent this patch.

By this logic we should also allow struct fields to have undefined size,
so long as the struct is only used indirectly. But I don't think we
should do that.

> the nice thing about pointers is that they're just an address to the 
> start of an item, so you don't have to care about the size of the item, 
> and the type checking rules for pointers can ignore undefined size -- 
> this is the premise of *opaque. i think it makes sense to generalize 
> this rule to pointers to items of undefined size, rather than 
> special-casing opaque.

opaque isn't special-cased anywhere: the only restrictions for pointers
are that the type's size must be non-zero (so it may be undefined) and
it can't be never. The restriction here is that array members must have
definite size.
Details
Message ID
<CVRLIZG5QDTQ.1NOL55YYO6JG@monch>
In-Reply-To
<20230917102229.18447-1-autumnull@posteo.net> (view parent)
DKIM signature
missing
Download raw message
tentative -1, i think hare-raylib should just use *opaque everywhere.
you don't really get much benefit from using *[*]opaque, and it's not
really possible to decide which to use programmatically
Reply to thread Export thread (mbox)