The align parameter allows the arena to handle any unusual alignments,
something that’s surprisingly difficult to do with libc. It’s
difficult to appreciate its usefulness until it’s convenient.
What situation will force you to have manual control over alignment?
On Wed, 31 Jul 2024 at 15:49, [GD] Apple area
<faris2007alsulami@gmail.com> wrote:
> What situation will force you to have manual control over alignment?
Any code that is meant to be highly performant needs to be
concerned about data alignment. For example, aligned SIMD loads
and stores need at least 16-byte alignment (32 or 64 bytes if you
are using wider AVX instructions). If you have a big array of
4-byte floats (f32/single) and you want to process them 4 at a
time, you can't skip over the first 2 and then process the rest
without killing your throughput.
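
To make the float-array case concrete, here is a minimal sketch in portable C11. The helper names (`is_aligned`, `floats_until_aligned`, `alloc_floats_avx`) are made up for illustration and are not from the article; the only library call is C11 `aligned_alloc`, which some platforms do not provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

// Nonzero if p sits on an `align`-byte boundary (align must be a power of two).
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

// Number of leading floats that sit before the next `align`-byte boundary.
// Assumes p is at least float-aligned.
static size_t floats_until_aligned(const float *p, size_t align)
{
    size_t bytes = (size_t)(-(uintptr_t)p & (align - 1));
    return bytes / sizeof(float);
}

// Allocate `count` floats on a 32-byte boundary (enough for 256-bit AVX).
// C11 aligned_alloc expects the size to be a multiple of the alignment,
// so the byte count is rounded up to the next multiple of 32.
static float *alloc_floats_avx(size_t count)
{
    size_t bytes = (count * sizeof(float) + 31) & ~(size_t)31;
    return aligned_alloc(32, bytes);
}
```

If you start 2 floats into a buffer that came from `alloc_floats_avx`, `floats_until_aligned(p + 2, 16)` reports that 2 more floats sit before the next 16-byte boundary, so every 4-wide load from there straddles a boundary.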
The same thing goes for any data you want to upload to a GPU
either for bulk processing/compute or for rendering/drawing to the
screen.
Usually when you are handing out pointers from an arena there is
nothing stopping you from giving the caller a pointer that ends in
a 1. If you then tell the CPU to load anything wider than 1 byte
from that pointer, some architectures will generate an exception
and others will do the load more slowly. If you are not writing
assembly, the compiler will generate slower code that does what
you asked in the cases where the language standard requires it
to do so.
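
For reference, a bump allocator with an align parameter could look roughly like this. It is a sketch under my own assumptions: the `Arena` struct and `arena_alloc` signature are hypothetical and may not match the article's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical arena: a region [beg, end) that is handed out front to back.
typedef struct {
    char *beg;  // next free byte
    char *end;  // one past the last usable byte
} Arena;

// Returns `size` bytes aligned to `align` (a power of two), or NULL if the
// arena does not have enough room left.
static void *arena_alloc(Arena *a, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    // Bytes to skip so that beg + padding is a multiple of align.
    size_t padding   = (size_t)(-(uintptr_t)a->beg & (align - 1));
    size_t available = (size_t)(a->end - a->beg);
    if (padding > available || size > available - padding) {
        return NULL;  // out of memory
    }

    void *p = a->beg + padding;
    a->beg += padding + size;
    return p;
}
```

With this shape the caller passes something like `_Alignof(double)` for ordinary objects, or 16/32/64 for SIMD or cache-line use, and never gets back a pointer that ends in a 1 unless it asked for 1-byte alignment.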
- Randy
--
https://rnpnr.xyz/
GPG Fingerprint: B8F0 CF4C B6E9 415C 1B27 A8C4 C8D2 F782 86DF 2DC5
> What situation will force you to have manual control over alignment?
Two situations: false sharing avoidance and SIMD loads/stores. If threads
regularly report results in their own array elements, I can make the array
64-byte aligned to guarantee elements get separate cache lines. With AVX
intrinsics I can allocate for aligned loads and stores.
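
A rough illustration of the false sharing case, using C11 `alignas` on a static array instead of an arena call. The 64-byte line size is an assumption (typical for x86-64), and the names `Slot`, `results`, and `own_cache_line` are invented for this sketch.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64  // assumption: typical x86-64 cache line size

// One result slot per thread. Aligning the member pads the whole struct
// out to a full cache line.
typedef struct {
    alignas(CACHE_LINE) long count;
} Slot;

static_assert(sizeof(Slot) == CACHE_LINE, "each slot fills one cache line");
static_assert(alignof(Slot) == CACHE_LINE, "slots start on a line boundary");

// Hypothetical per-thread results: thread i only ever writes results[i].
static Slot results[4];

// Nonzero if the two slots start in different cache lines.
static int own_cache_line(const Slot *a, const Slot *b)
{
    return (uintptr_t)a / CACHE_LINE != (uintptr_t)b / CACHE_LINE;
}
```

Without the alignment, four `long` counters would fit in a single 64-byte line, and every write by one thread would invalidate that line for the others.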