Chris, I just finished your article on the hash map and I was curious why you didn't rotate the hash value? I see if you get to the bottom of the trie you just start linking nodes off the first child creating a linked list. I assume rotation isn't really necessary if you have a 64-bit hash map because of the diminished chance of collision? Also, you mentioned relative pointers and I was curious how you implemented them in the context of arena allocation so I looked at markov.c: #define p32store(i, p) *(i) = (p) ? (p32)(((uptr)(p) - (uptr)(i))>>2) : 0 #define p32load(i) (*(i) ? (void *)((uptr)(i) + ((uptr)*(i)<<2)) : 0) I'm assuming you are shift by two because you'll never have an alignment less than 4? I quickly checked the code and that doesn't appear to be the case. It doesn't really save any more space, it just allows you to "reach" farther with the same number of bits, no? Steadman
Good to hear from you again, Steadman! > rotate the hash value? If it's exhausted the hash — 32 levels deep — then it's a full, 64-bit collision. One of multiple so far. Rotating will not accomplish anything because it's reverting to a linked list regardless. Everything this far is rotating/shifting the same hash. Shifting is slightly simpler, so I picked that instead. NRK had initially proposed a rotation, and I did that myself until I noticed it didn't matter. > because you'll never have an alignment less than 4? Yes, relative pointers are 4-byte integers, and therefore 4-byte aligned, that point to at least 4-byte aligned objects. I shift out the always-zero lowest two bits to extend it to a 34-bit pointer that can reach 8GiB each direction. It's "saving space" in the sense that I can fit a 34-bit offset in a 32-bit integer. With some difficulty and inconvenience I could make it relative to the containing map node, which is 8-byte aligned, and double the reach again to 16GiB each direction. If I make map nodes 16-byte aligned — which is probably still reasonable — I could double it yet again to 32GiB each way, which could then access all the physical memory of a good workstation. I stopped before taking it this far due to the debugging issues mentioned in my article. In the macros notice I'm using uptr, not char*, to compute addresses. It's necessary for the shifts, but also because char* must follow pointer rules about comparing pointers from different objects. I made this mistake in an earlier experiment and quick ran afoul of GCC's optimizer. > I quickly checked the code and that doesn't appear to be the case. The arena ensures everything is properly aligned, and p32 pointers only point to two kinds of objects: "counts" and "markov" each a kind of hash trie, and therefore each pointer-aligned (8-byte aligned). That's greater than 4, so my 2-bit shift trick works. Always-zero bits are determined by min(src_align, dst_align). If a relative pointer points at string data, I'd need to ensure the data is at least 4-byte aligned. Fortunately allocating such an aligned string using an arena is a piece of cake! If you want to see the consequences of over-aggressively shifting out low bits, try changing the macros to 3-bit shifts and observe the results! It should quickly crash. Also, thanks for taking a look at my Markov Chain / relative pointer demo!
> Rotating will not accomplish anything because it's reverting to a > linked list regardless. Yup, using rotation was a rough first idea. And after thinking about it more, I also came to realize that it isn't doing anything compared to a shift. And shifting is both simpler and it communicates the "degrade into linked list" condition more clearly so that's what I use now too. And while this might be obvious, it's worth noting that unlike a (chained) hash-table, a full hash collision doesn't necessarily trigger a linear linked list search for hash-tries. In order for it to degrade, it needs to first exhaust all the nodes in that "path". That's 16 nodes if you're using a 32bit hash (assuming 4-ary trie) and 32 nodes for 64bit hash. And for this reason, if you're planning to inserts thousands of items, using a 64bit hash becomes more important for hash-tries. You could also keep count of how deep you've traversed and re-hash with a new seed upon detecting full hash collision. But I've noticed that with 64bit hashes and my synthetic tests, I couldn't trigger the linked list condition without going OOM first. But what if you're worried about malicious input? This also might be obvious, but haven't been explicitly pointed out so far, you can deal with this in the exact way hash-tables deal with it. By using a cryptographic hash function (e.g siphash) with a secure seed. So far, I'm fairly impressed with just how flexible the underlying design of hash-tries really is. It seems to be pretty easy to adapt it to many different use-cases. - NRK
Yeah, it's funny right after I sent the email to Chris I thought about it a bit more and came to the same conclusion that both of you did. Good point about collisions not degrading immediately into a linked list search, I didn't think of that. I like this data structure so much I decided to take up Chris' offer and add linked list that maintains insertion order: https://gist.github.com/dillstead/141e2e5a6f0d2e5386c3aa7d069cea9c No matter what, I wanted insertion to be constant after I added the element to the hash trie. In order to do this I had to add an additional pointer (beyond the next pointer added to each strset) and I toyed with several different ways of doing it, e.g., pass an in/out parameter to is_member. Ultimately I decided I wanted to keep the is_member interface unchanged so I wrapped is_member with code that creates one additional strset that serves as the root. The actual hash trie starts at root's 0th child. In order to eliminate an extra NULL check on insertion, I unioned the next pointer with a tail pointer to pointer which I use for constant insertion to the list. As far as I can tell, the union is being used correctly and does not exhibit undefined behavior. Feel free to point out if I'm wrong on that. I also toyed with a few ideas for iteration but ultimately settled with a set of simple macros that abstract away the additional root: #define list_begin(x) (x ? x->child : x) #define list_next(x) (x ? x->next : x) #define list_item(x) (x ? &x->key : 0) I smoke tested the code with a simple test, it worked, which is good enough for me at this point. I'm definitely going to add this to my toolbox! Shifting gears, really interesting point you bring up, Chris, regarding using uptr not char * in your relative pointer implementation. It's obvious to me that you did this for shifts given that char can be signed but running afoul of the optimizer is much more subtle. How did you figure this out? Steadman On Thu, Oct 12, 2023 at 11:40 PM NRK <firstname.lastname@example.org> wrote: > > > Rotating will not accomplish anything because it's reverting to a > > linked list regardless. > > Yup, using rotation was a rough first idea. And after thinking about it > more, I also came to realize that it isn't doing anything compared to a > shift. And shifting is both simpler and it communicates the "degrade > into linked list" condition more clearly so that's what I use now too. > > And while this might be obvious, it's worth noting that unlike a > (chained) hash-table, a full hash collision doesn't necessarily trigger > a linear linked list search for hash-tries. > > In order for it to degrade, it needs to first exhaust all the nodes in > that "path". That's 16 nodes if you're using a 32bit hash (assuming > 4-ary trie) and 32 nodes for 64bit hash. And for this reason, if you're > planning to inserts thousands of items, using a 64bit hash becomes more > important for hash-tries. > > You could also keep count of how deep you've traversed and re-hash with > a new seed upon detecting full hash collision. But I've noticed that > with 64bit hashes and my synthetic tests, I couldn't trigger the linked > list condition without going OOM first. > > But what if you're worried about malicious input? This also might be > obvious, but haven't been explicitly pointed out so far, you can deal > with this in the exact way hash-tables deal with it. By using a > cryptographic hash function (e.g siphash) with a secure seed. > > So far, I'm fairly impressed with just how flexible the underlying > design of hash-tries really is. It seems to be pretty easy to adapt it > to many different use-cases. > > - NRK
> How did you figure this out? (I just remembered I forgot to respond to this.) I wrote my first attempt in C++ using operator overloading in order to have nice ergonomics while I got the hang of relative pointers, and GCC's optimization punished me for misuse of char* pointer arithmetic: https://lists.sr.ht/~skeeto/public-inbox/%3C20230902232504.hcihm5fdw6k7kejq%40gen2.localdomain%3E#%3C20230911223809.email@example.com%3E While that was mostly a result of being cheeky in my test, using local variables rather than a proper arena, I expect it's necessary even with arenas when also using the "malloc" function attribute. It specifically declares that returned pointers are unrelated, and therefore cannot be compared using char*. So uintptr_t is really the only safe way to do it.