~sircmpwn/public-inbox

Data map keys in BARE

Message ID: <D8YOS7L40QPS.1YI0I2DK96IQB@runxiyu.org>
Sender timestamp: 1743883173
DKIM signature: pass

Hi,

On 16 March 2025, Drew DeVault wrote:
> Any map key type (directly or via a user-defined type) MUST be of
> a primitive type that is not f32, f64, data, data[length], or
> void.

Is there any reason to disallow `data` and `data[length]` as map key types,
but to allow `str`?  I encountered this when I was trying to map Git
object IDs, which are `data[20]` when using classical SHA-1 hashes; and
when communicating environment variable keys, which are not guaranteed
to be UTF-8.

(If the reason is Go, then (1) Go does allow comparing arrays, and (2)
Go's byte slices may be compared, and therefore transitively used as map
keys, by converting them into a Go string, since comparing strings in Go
is simply comparing the underlying bytes after comparing the lengths
(note that len(string) is in bytes, not codepoints).)
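
A minimal sketch of both points follows. The `ObjectID` type and the
`bytesKey` helper are hypothetical names chosen for illustration; they are
not taken from any BARE implementation.

```go
package main

import "fmt"

// ObjectID is a hypothetical stand-in for a BARE data[20] value,
// such as a SHA-1 Git object ID.
type ObjectID [20]byte

// bytesKey converts a byte slice into a string so that it can serve as a
// map key. The conversion copies the bytes verbatim and does not validate
// or alter them, so non-UTF-8 input is preserved.
func bytesKey(b []byte) string {
	return string(b)
}

func main() {
	// (1) Fixed-size arrays are comparable, so they work as map keys directly.
	refs := map[ObjectID]string{}
	var id ObjectID
	id[0] = 0xde
	refs[id] = "refs/heads/master"

	same := id // arrays are copied by value; equal contents mean equal keys
	fmt.Println(refs[same])

	// (2) Byte slices are not comparable, but their string conversion is.
	env := map[string]string{}
	raw := []byte{0xff, 'K'} // 0xff is not valid UTF-8
	env[bytesKey(raw)] = "value"
	fmt.Println(env[bytesKey([]byte{0xff, 'K'})])

	// len counts bytes, not code points.
	fmt.Println(len(bytesKey(raw)))
}
```

A fixed-length `data[length]` maps naturally onto a Go array, while a
variable-length `data` would need the string conversion (or a similar
wrapper) at the map boundary.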

Thanks!

Best regards,
Runxi Yu

Message ID: <Z/dMZMBmS7+rtmOf@gmail.com>
In-Reply-To: <D8YOS7L40QPS.1YI0I2DK96IQB@runxiyu.org>
Sender timestamp: 1744267396
DKIM signature: pass

> > Any map key type (directly or via a user-defined type) MUST be of
> > a primitive type that is not f32, f64, data, data[length], or
> > void.
> 
> Is there any reason to disallow `data` and `data[length]` as map key types,
> but to allow `str`?  I encountered this when I was trying to map Git
> object IDs, which are `data[20]` when using classical SHA-1 hashes; and
> when communicating environment variable keys, which are not guaranteed
> to be UTF-8.

This restriction has been in BARE since the beginning, and I found only
two relevant discussions. The first is about the reason for the
distinction between primitive and aggregate data types [1]:

> To make it more likely that implementations will be able to use
> language-native hashable types as map keys.

The second is from when we forbade f32/f64 as map keys and considered
doing the same for str, because UTF-8 is not canonical [2]:

> I don't think so. I am not aware of programming languages that
> normalize strings for maps' keys. Binary comparison is used.
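
To illustrate the binary-comparison point, here is a minimal Go sketch
(`distinctKeys` is a hypothetical helper written for this example). Two
strings that render as the same character but differ in their UTF-8 bytes
end up as two separate map keys:

```go
package main

import "fmt"

// distinctKeys reports how many distinct map keys the given strings produce.
func distinctKeys(keys ...string) int {
	seen := map[string]struct{}{}
	for _, k := range keys {
		seen[k] = struct{}{}
	}
	return len(seen)
}

func main() {
	composed := "\u00e9"    // "é" as the single code point U+00E9
	decomposed := "e\u0301" // "e" followed by U+0301 (combining acute accent)

	// Go compares strings byte by byte and performs no Unicode normalization.
	fmt.Println(composed == decomposed)             // false
	fmt.Println(distinctKeys(composed, decomposed)) // 2
}
```

So a str key already behaves like an opaque byte sequence for lookup
purposes.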

I like the idea of allowing data/data[length] as map keys, because they
are surely just as comparable as str. Moreover, it would make it simpler
to define a user-defined hash type and then use it as a map key.

Are there any objections? Did I miss something?

Thank you for your feedback!

[1]: https://lists.sr.ht/~sircmpwn/public-inbox/%3CC3NSE1PQKH0L.2GV2FPU59J4MK@ashryn%3E#%3CC3NTRVWDAHAG.1UINBXZ76F9WF@homura%3E
[2]: https://lists.sr.ht/~sircmpwn/public-inbox/%3CCAFFTG-a-Vci+kS_d_=uaX8kyxszqtB-79pcb5580Z9xw72V5kw@mail.gmail.com%3E#%3C4856d594-05b4-80f9-0665-12c2bb8f7937@elvinger.fr%3E