I'm glad everyone's on board.
> Would we want to actually enforce that the digit separator "makes
> sense", i.e. actually separating the hundreds, thousands, etc?
Neither Rust, C/C++, nor Zig enforce separating by
hundreds/thousands/nibbles/bytes, and i don't think we should either.
> The easiest way to implement would probably be just to allow and
> ignore `_` in number literals.
That's the state my implementation is in. But after making it, i've come
up with some reasonable restrictions on where the separator may go:
1. No more than one separator may be used consecutively.
Illegal: `1__000`
2. A number may not end with a trailing separator.
Illegal: `1000_`
3. A separator may not immediately follow a number base prefix.
Illegal: `0x_ACAB`, `0b_1010`
4. A separator may not be used inside a type suffix.
Illegal: `42069u3_2`
5. A separator may not be used immediately before a type suffix.
Illegal: `1337_u64`
Of course no one would do something insane like write `1u3_2`, so the
necessity of rules 1-4 is questionable, but i think the rule 5 example
looks nicer than `1337u64`, and maybe some others would too. That would
create a style disparity that's best avoided, so i think the stricter
rules should be in place.
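To make the five rules above concrete, here is a rough sketch of a checker that reports which rules a literal breaks. It is only an illustration, not the submitted patch: the function name `violated_rules` is made up, it handles integer literals only, and it assumes the type suffix begins at the first `i`, `u` or `z` in the literal.

```python
import re


def violated_rules(literal: str) -> list[int]:
    """Return the numbers of the proposed rules (1-5) that `literal` breaks.

    Hypothetical helper for illustration. Integer literals only; the type
    suffix is assumed to start at the first 'i', 'u' or 'z'.
    """
    violations = []

    # Rule 1: no more than one separator in a row.
    if "__" in literal:
        violations.append(1)

    # Rule 2: no trailing separator.
    if literal.endswith("_"):
        violations.append(2)

    # Rule 3: no separator immediately after a base prefix.
    if re.match(r"0[xob]_", literal):
        violations.append(3)

    suffix = re.search(r"[iuz]", literal)
    if suffix:
        start = suffix.start()
        # Rule 4: no separator inside the type suffix.
        if "_" in literal[start:]:
            violations.append(4)
        # Rule 5: no separator immediately before the type suffix.
        if start > 0 and literal[start - 1] == "_":
            violations.append(5)

    return violations
```

For example, `violated_rules("1337_u64")` reports only rule 5, while `violated_rules("1_000_000")` reports nothing.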
I'd like approval and/or feedback on the above rules, then i will submit
the patch. After that, i can update the Vim syntax myself but i'd like
someone smarter to handle the specification.
On Sat Nov 11, 2023 at 10:12 PM EST, JTurtle wrote:
> I've come up with some reasonable restrictions on where the separator
> may go:
>
> 1. No more than one separator may be used consecutively.
> Illegal: `1__000`
> 2. A number may not end with a trailing separator.
> Illegal: `1000_`
> 3. A separator may not immediately follow a number base prefix.
> Illegal: `0x_ACAB`, `0b_1010`
> 4. A separator may not be used inside a type suffix.
> Illegal: `42069u3_2`
> 5. A separator may not be used immediately before a type suffix.
> Illegal: `1337_u64`
Those rules seem sane, but I don't think that we should enforce them in
the compiler if there is no "real reason" for them apart from
aesthetics. They should instead be added to the style guide.
I think the only "necessary" rule is that the number literal cannot
start with a `_`, to avoid confusion with a symbol. But even that may
not need special code. I assume (naively?) that the code can be
structured such that the token will be detected as a symbol if it starts
with a `_` but is intended to be a number literal, and the error that
the user sees will be the usual
/.../main.ha:3:27: error: Unknown object '_1234'
3 | fmt::println(_1234)!;
| ^
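The idea that no special code is needed can be sketched as a lexer that picks the token kind from the first character alone. This is a guess at the general shape, not a description of the actual compiler; `classify` and its return values are hypothetical names.

```python
def classify(token: str) -> str:
    """Pick a token kind from the first character only (illustrative)."""
    first = token[:1]
    if "0" <= first <= "9":
        # Only a leading ASCII digit starts a number literal.
        return "number"
    if first == "_" or (first.isascii() and first.isalpha()):
        # A leading underscore or letter starts a name, so `_1234`
        # never reaches the number-literal code path at all.
        return "name"
    return "other"
```

Under this assumption `classify("_1234")` yields `"name"`, so the lookup fails later with the usual unknown-object error, and `classify("1_234")` yields `"number"`.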
On Sun Nov 12, 2023 at 4:36 AM CET, Sebastian LaVine wrote:
> On Sat Nov 11, 2023 at 10:12 PM EST, JTurtle wrote:
> > I've come up with some reasonable restrictions on where the
> > separator may go:
> >
> > 1. No more than one separator may be used consecutively.
> > Illegal: `1__000`
> > 2. A number may not end with a trailing separator.
> > Illegal: `1000_`
> > 3. A separator may not immediately follow a number base prefix.
> > Illegal: `0x_ACAB`, `0b_1010`
> > 4. A separator may not be used inside a type suffix.
> > Illegal: `42069u3_2`
> > 5. A separator may not be used immediately before a type suffix.
> > Illegal: `1337_u64`
>
> Those rules seem sane, but I don't think that we should enforce them in
> the compiler if there is no "real reason" for them apart from
> aesthetics. They should instead be added to the style guide.
The compiler should enforce these rules, via the grammar.
My first patch was rejected, which is good because i submitted it
earlier than i should have. I'll wait for an authoritative "we're ready,
do it" from someone before trying that stunt again.
I found a more concise way to explain separator restrictions that covers
all the cases (i hope).
"Separators may only be placed between two valid digits, and not within
a type suffix."
Here's a list of invalid numbers according to this rule:
- 1__0
- 0x_ABACA
- 0u6_4
- 0u_64
- 100_
- 2_e8
- 2e_8
- 0x5_p3
- 20._0
- 1_.45
If you can find an unspecified corner case, please tell me.
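As a sanity check of the one-sentence rule, here is a small sketch that tests every separator against its two neighbours. It is an illustration under assumptions, not the patch: `separators_ok` and `DIGITS` are made-up names, the type suffix is assumed to start at the first `i`, `u` or `z` (or `f` outside hex literals), and only separator placement is checked, not whether the literal is otherwise well-formed.

```python
# Valid digits per base prefix; anything without a prefix is decimal.
DIGITS = {"0x": "0123456789abcdefABCDEF", "0o": "01234567", "0b": "01"}


def separators_ok(literal: str) -> bool:
    """True if every `_` sits between two valid digits, outside the suffix."""
    prefix = literal[:2]
    digits = DIGITS.get(prefix, "0123456789")

    # Assumption: 'f' starts a suffix (f32/f64) except in hex, where it
    # is an ordinary digit.
    suffix_chars = "iuz" if prefix == "0x" else "iuzf"
    suffix_start = next(
        (i for i, c in enumerate(literal) if c in suffix_chars),
        len(literal),
    )

    for i, c in enumerate(literal):
        if c != "_":
            continue
        # A separator needs a neighbour on both sides.
        if i == 0 or i + 1 >= len(literal):
            return False
        # No separators within the type suffix.
        if i > suffix_start:
            return False
        # Both neighbours must be digits of the literal's base. This also
        # rejects `_` next to a prefix letter, `.`, `e`, `p`, another `_`,
        # or the first letter of a suffix.
        if literal[i - 1] not in digits or literal[i + 1] not in digits:
            return False
    return True
```

All ten literals in the list above come out as invalid with this check, while `1_000_000`, `0xAC_AB` and `1_000u64` pass.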
Sorry i messed up the sourcehut thread, and thank y'all for being
patient with me. I just wanted my long constants to be prettier and got
way outside my comfort zone.
this sounds good to me! btw, i don't think you were premature in sending
off a patch - it's a good way to gather more feedback, and the code is
another form of specification which can sometimes help clarify things