The following Godbolt link worked when I tested it, but I may have
introduced a bug between writing it and putting the code on GitHub:
https://godbolt.org/z/MhY19ca5o
To answer some of your questions:
> so then how could the thunk work?
The NESTED_UPGRADE macro and struct definition make this work.
I treat the thunk as just another object and read the data out of
it. The thunk itself should never be executed, so it's safe to mark
the stack as non-executable.
> Further, since __jmp is defined in that macro-level block scope,
isn't it out of scope and therefore dead before it's used?
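No, not in practice. The name goes out of scope, but the pointer
stays usable: as far as I know, the thunk gcc builds when you take
the nested function's address lives in the enclosing function's
stack frame, so it remains valid until that function returns. gcc's
documentation only calls this risky when the nested function refers
to variables that have themselves gone out of scope. This is the
same trick that makes gcc "lambdas" work.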
> I suppose the xsetjmp thunk could restore the stack pointer when
called, and it could find its closure context using rip-relative
addressing (as usual).
In my testing, gcc was smart enough to restore everything, stack
pointer included, to how it was. Aside from the odd things I do to
avoid needing an executable stack, everything else is something gcc
expects to work with nested functions, including jumping in and out
of them.
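
The "jumping out" part is documented gcc behaviour: a nested function may goto a label that its containing function declares with __label__, and gcc unwinds back to the containing frame for you. A minimal sketch, separate from the code in the link (sum_nonnegative and add are hypothetical names; GNU C only):

```c
#include <assert.h>
#include <stddef.h>

/* Sums the first n elements. If a negative value is seen, the nested
   function jumps straight out to the enclosing function, which returns -1. */
int sum_nonnegative(const int *values, size_t n)
{
    __label__ bail;              /* local label, visible to nested functions */
    int total = 0;

    void add(int v)
    {
        if (v < 0)
            goto bail;           /* nonlocal goto: leaves add() entirely */
        total += v;              /* writes the enclosing function's local */
    }

    assert(values != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        add(values[i]);          /* direct call, so no thunk is needed */
    return total;

bail:
    return -1;
}
```

Calling sum_nonnegative on {1, 2, 3} gives 6, and on {1, -2, 3} gives -1. Because add is only ever called directly, gcc passes the static chain itself and never builds a thunk on the stack.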