In my language, when monomorphizing generic structures and functions,
the compiler generates identifiers whose names are encoded so that they
can be decoded to get the original types back.
For example, a type `Foo<i32, f32>` may encode to `Foo.0.11.13.1` where:
    Foo => the struct name
    0   => start of a generic scope
    11  => word (i32)
    13  => single (f32)
    1   => end of a generic scope
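As a rough illustration of the scheme above, here is a minimal encoder
sketch in C. Only the tag values (0, 1, 11, 13) come from the example;
the `Type` struct and the `append` and `encode` helpers are hypothetical
names made up for this sketch, not the compiler's actual code.

```c
#include <stdio.h>
#include <string.h>

/* Tag values taken from the example above; all other names are hypothetical. */
enum { TAG_OPEN = 0, TAG_CLOSE = 1, TAG_WORD = 11, TAG_SINGLE = 13 };

typedef struct Type Type;
struct Type {
	const char *name; /* struct name, or NULL for a primitive */
	int tag;          /* primitive tag, used when name is NULL */
	const Type *args; /* generic arguments */
	int nargs;
};

/* Append s at buf[len]; return the new length, or -1 if it does not fit. */
static int append(char *buf, int len, int cap, const char *s)
{
	int n = (int)strlen(s);

	if (len < 0 || len + n >= cap)
		return -1;
	memcpy(buf + len, s, n + 1);
	return len + n;
}

/* Encode t at buf[len]; return the new length, or -1 on overflow. */
static int encode(const Type *t, char *buf, int len, int cap)
{
	char num[16];
	int i;

	if (t->name) {
		len = append(buf, len, cap, t->name);
	} else {
		snprintf(num, sizeof num, "%d", t->tag);
		len = append(buf, len, cap, num);
	}
	if (t->nargs == 0)
		return len;
	snprintf(num, sizeof num, ".%d", TAG_OPEN);
	len = append(buf, len, cap, num);
	for (i = 0; i < t->nargs; i++) {
		len = append(buf, len, cap, ".");
		if (len < 0)
			return -1;
		len = encode(&t->args[i], buf, len, cap);
	}
	snprintf(num, sizeof num, ".%d", TAG_CLOSE);
	return append(buf, len, cap, num);
}
```

Calling `encode` on a `Foo` node with a word and a single argument fills
the buffer with `Foo.0.11.13.1`. A buffer that is too small yields -1,
which is the same kind of failure a fixed identifier limit produces.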
The problem shows up with nested generics. Take for example this generic
function over T and U (where T = Triple<i32, i32, i32> * and
U = Triple<i32, i32, i32> *[]):
    fn HashMap::with_capacity<Triple<i32, i32, i32> *, Triple<i32, i32, i32> *[]>(i32 initial_capacity);
In IR this would generate:
    function l $HashMap.with_capacity.0.2.Triple.0.11.11.11.1.2.Array.0.2.Triple.0.11.11.11.1.1.1(w %initial_capacity.6152) {
The function name is 81 characters long, which just exceeds the
80-character identifier limit that NString sets in `all.h`:
```h
enum {
	NString = 80,
	NIns = 1 << 20,
	NAlign = 3,
	NField = 32,
	NBit = CHAR_BIT * sizeof(bits),
};
```
One possible front-end workaround is aliasing, where something like
`100` maps to `Triple.0.11.11.11.1`. But this breaks the philosophy I
had that generics can be uniquely encoded and decoded when performing
monomorphization: it would require a lookup map in the compiler and it
would be a lot more complicated. Even then, this just postpones the
problem. The only way to really solve it on my side is a lookup map
from entire generic structs to unique IDs, and that again breaks the
simplicity.
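To make that trade-off concrete, here is a minimal sketch of such a
lookup map in C. Everything in it is hypothetical: the `Intern` table,
the `intern`, `lookup` and `short_name` helpers, and the `g.<id>` naming
are assumptions for illustration, not something the compiler or QBE
provides.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table: the index of a full encoded name is its ID. */
typedef struct {
	char **names;
	int len, cap;
} Intern;

/* Return the ID for name, adding it if unseen; -1 if allocation fails. */
static int intern(Intern *t, const char *name)
{
	size_t n;
	char *copy;
	int i;

	/* Linear search keeps the sketch short; a hash map would scale better. */
	for (i = 0; i < t->len; i++)
		if (strcmp(t->names[i], name) == 0)
			return i;
	if (t->len == t->cap) {
		int cap = t->cap ? t->cap * 2 : 16;
		char **p = realloc(t->names, cap * sizeof *p);

		if (!p)
			return -1;
		t->names = p;
		t->cap = cap;
	}
	n = strlen(name) + 1;
	copy = malloc(n);
	if (!copy)
		return -1;
	memcpy(copy, name, n);
	t->names[t->len] = copy;
	return t->len++;
}

/* Recover the full encoded name for an ID, or NULL if the ID is unknown. */
static const char *lookup(const Intern *t, int id)
{
	return id >= 0 && id < t->len ? t->names[id] : NULL;
}

/* Write the short symbol for an ID into buf (assumed "g.<id>" form). */
static int short_name(int id, char *buf, size_t cap)
{
	return snprintf(buf, cap, "g.%d", id);
}
```

The emitted symbol stays short no matter how deep the nesting goes, but
it can only be turned back into types by consulting the table, which is
exactly the loss of self-describing names described above.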
Is it possible to increase this identifier cap from 80 characters? I
understand why it's there (identifiers are static arrays of this length,
80 is the standard width of a terminal) but I see no reason for it to be
so small. Even 160 would be much more useful than 80.
If possible, it could be behind a compilation flag. QBE doesn't really
have many compilation flags excluding the very basic ones, so it may
pollute that principle of simplicity to introduce a whole flag for such
a minor difference, which leads me to believe that just fully increasing
the capacity is not the worst idea.
Sidenote, I did try to compile QBE with an increased identifier length
cap of 320 characters. All of the tests with `make check` passed. GCC
also seems to have no limit on the length of symbols in objects, so a
program with much longer symbol names compiles and runs fine.
> Is it possible to increase this [NString] cap from 80 characters?
On Fri, Jan 3, 2025 at 6:28 PM John Nunley <dev@notgull.net> wrote:
>
> +1 to this. I encountered this issue during monomorphization of Rust
> types in Dozer.
Just note that QBE "named" structures are coded to include the max
NString size in the struct itself (always).
https://c9x.me/git/qbe.git/tree/all.h#n237
https://c9x.me/git/qbe.git/tree/all.h#n322
... etc.
Just bumping up NString right now will inflate QBE memory usage.
I don't see why indirection with PFn allocation of "name"s wouldn't
suffice, and possibly be better in general?
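
A minimal sketch of the difference between the two layouts, in C. The
`InlineName` and `PtrName` structs and the `setname` helper are
hypothetical stand-ins, not QBE's real definitions, and plain malloc
stands in for whatever pool allocation QBE would actually use.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { NString = 80 };

/* Hypothetical stand-in: every instance pays for NString bytes,
 * however short the name actually is. */
typedef struct {
	char name[NString];
	int id;
} InlineName;

/* Hypothetical stand-in: the struct holds only a pointer, and the
 * name lives in storage sized to fit it. */
typedef struct {
	char *name;
	int id;
} PtrName;

/* Copy name into freshly allocated storage; return 0, or -1 on failure.
 * malloc is an assumption here; a pool allocator would replace it. */
static int setname(PtrName *p, const char *name)
{
	size_t n = strlen(name) + 1;
	char *s = malloc(n);

	if (!s)
		return -1;
	memcpy(s, name, n);
	p->name = s;
	return 0;
}
```

With the inline layout, raising NString grows every such struct whether
or not its name is long. With the pointer layout, the struct size stays
fixed and only long names cost more.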
Look forward to the patch :P
R
On Sun, Dec 29, 2024, at 23:12, Rosie wrote:
> Is it possible to increase this identifier cap from 80 characters?
Yes, you can unblock yourself by bumping it; everything will work
the same. As Roland said, you may simply observe additional memory
consumption.
> I understand why it's there (identifiers are static arrays of this length,
> 80 is the standard width of a terminal) but I see no reason for it to be
> so small. Even 160 would be much more useful than 80.
You can 'git blame' the NString definition for history :).
Some strings are already dynamically sized. We should probably
just have more of them.