~skeeto/public-inbox

3 2

hash-trie usage out in the wild

Details
Message ID
<ejjuqlfdcp7uwo7pstbyacbbhbl4nevzlkes6wrwgahp6moxyo@vazdu7icouag>
DKIM signature
pass
Download raw message
Hi Chris,

I was just skimming over the latest rad debugger/linker (they're both
hosted in the same repo) changelog when I noticed this:

	https://github.com/EpicGamesExt/raddebugger/releases/tag/v0.9.14-alpha
	> Switched symbol table data structure to hash trie map.

Wondering if it's the same hash-trie we use, I dug into the code:

	https://github.com/EpicGamesExt/raddebugger/blob/b99e10224f7e85edb148f789eb4b95709b986d80/src/linker/lnk_symbol_table.c#L391

Turns out it *is* the same construct (it seems to be using the atomic
version too). Cool to see hash-tries gaining popularity outside our
small bubble. Thought you'd be interested in this, thus sharing :)

- NRK
Details
Message ID
<20241116173833.mcjjz3wrdgvm4bdt@nullprogram.com>
In-Reply-To
<ejjuqlfdcp7uwo7pstbyacbbhbl4nevzlkes6wrwgahp6moxyo@vazdu7icouag> (view parent)
DKIM signature
missing
Download raw message
Interesting! Hash tries have proven so useful, particularly with arena 
allocation, that it *should* be more widespread! I might say it's even 
inevitable. They even used our "child[N]" naming convention.

Rollback is a notable divergence. In our version, when losing the insert 
race the arena is simply reverted. They're using a Fleury Arena, as I'll 
call it, and reverting isn't straightforward, so they toss the discarded 
node on a free list instead.

The way we build hash tries, including the RAD folks, is sensitive to poor 
hash functions. The usual naive hash functions people typically use (e.g. 
the Java string hash) results in worst case behavior. High bits need to be 
mixed reasonably well. I wanted to look at that, and they're using a super 
optimized xxHash, so no problems in that area. Even more, when hashing 
strings they prefix the contents with its length in the hash input. I do 
not expect it makes a practical difference for hash tries, but something 
to investigate.

Reading through more of the code, I'd like to work on software this well 
written all the time. Maybe I should apply to work for RAD…
Details
Message ID
<3C41N9E7OMBC6.3CN6WDNVABMY0@rnpnr.xyz>
In-Reply-To
<20241116173833.mcjjz3wrdgvm4bdt@nullprogram.com> (view parent)
DKIM signature
permerror
Download raw message
Christopher Wellons <wellons@nullprogram.com> wrote:
> Reading through more of the code, I'd like to work on software this well
> written all the time. Maybe I should apply to work for RAD…

In case you didn't try compiling them each of the tools are single
translation units. I think its pretty neat that there are other
people out there who care enough about code quality that they can
write very complex graphical applications which only need a single
compiler invocation to build.

I'm acutally a little suprised you didn't know about them
previously, a lot of the techniques you have talked about are
closely mirrored by them and people who have worked for them (eg.
Casey Muratori who runs the "Computer, Enhance!" blog).

Another thing they do which I find very useful for writing
multiplatform code (or code with optional runtime libraries) is
using macros for function prototypes. For example:

#define PLATFORM_WRITE_FN(name) size name(iptr handle, void *data, size length, size offset)
typedef PLATFORM_WRITE_FN(platform_write_fn);

This has two benefits: 1. if you need to change the parameters you
only need to change them in one place. 2. if you need to take a
pointer to the function (maybe for a callback) you know that all
of the platform write functions in your code will be compatible
with the platform_write_fn pointer type.

You can also easily define a stub function if this was for an
optional runtime feature (I do this with cuda code paths in a GPU
computation program I wrote for my research lab).

--
https://rnpnr.xyz/
GPG Fingerprint: B8F0 CF4C B6E9 415C 1B27 A8C4 C8D2 F782 86DF 2DC5
Details
Message ID
<20241117152546.bfxpkx246za2ep4m@nullprogram.com>
In-Reply-To
<3C41N9E7OMBC6.3CN6WDNVABMY0@rnpnr.xyz> (view parent)
DKIM signature
missing
Download raw message
Yup, I knew about the jumbo build, and I've seen the clip demonstrating 
that re-building RAD Debugger from source is faster than simply loading 
Visual Studio. I've learned a ton from RAD indirectly via Handmade Hero, 
and it's no coincidence my techniques mirror theirs.

It was striking to examine actual RAD code, particularly when it's in my 
favorite area, systems programming. HH has been on hiatus for nearly two 
years, and I haven't looked at that code in at least that long. So I got 
used to "normal" systems code, especially hacking on GDB, Binutils, GCC, 
etc. for w64dk. The differences at the source level between RAD Debugger 
and GDB are just remarkable. (And even greater compared to the stuff I'm 
paid to work on.)
Reply to thread Export thread (mbox)