So eg. an allocation of 4000 bytes is aligned to 2048. It means you can do things like just malloc a 128-bit vector and it's aligned and you never have to worry about it. You never have to manually ask for alignment as long as the alignment you want is <= the size of your object (which it almost always is).
eg. if you want to malloc some MAT4x4 objects, you just do it and you know that they are aligned to sizeof(MAT4x4).
Is there any disadvantage to this? (eg. does it waste a lot of memory compared to more conservative alignment schemes?)
Also, I used to always do my malloc debug tracking "intrusively" , that is by sticking an info header at the front of the block and allocating and bigger piece for each alloc, then linking them together. The advantage of this is that it's very fast - when you free you just go to (ptr - sizeof(info_header)).
I think I am now convinced that that is the wrong way to do it. It's better to have a separate tracker which hashes from the pointer address to an info struct. The big advantage of the "non-intrusive" way like this is that it doesn't change the behavior of the allocator at all. So things like alignment aren't affected, and neither is cache usage or optimization issues (for example if you're using a GC-type arena allocator and adjacency of items is important to performance).
In general now I'm more eager for debugging and instrumentation schemes like this which have *zero* affect on the behavior of the core functionality, but basically just watch it from the outside.
(For example on consoles where you have 256M of memory in the real consoles and an extra 256M in the dev kits, it's ideal to make two separate allocators, one in the lower 256 where all your real data goes and one for the upper 256 where all your non-intrusive debug extra data goes; in practice this is a pain in the butt, but it is the ideal way to do things, so that you have the real final version of the game running all the time in the lower 256).