3/22/2008

03-22-08 - 1

Jordan Chavez wrote me to point out the Weak Pointer implementation I have in cblib is pretty dumb. Yeah, he's right. Let me explain a bit. I wrote an article on smart & weak pointers long ago. I think they're rad. I'm going to assume you have an intrusive refcounted because it's just so superior to nonintrusive.

Now so far as I can tell there are two major styles of weak pointer. One is where the WeakPtr actually holds a pointer straight to the object and gets cleaned up intrusively when the object dies. The other is where the WeakPtr has some kind of indirect handle to the object which becomes an invalid handle when the object goes away.

The first way where the WeakPtr actually points at the object has one big advantage : pointer accesses are very fast, in fact they're free. There's no need to look up a handle, if you have a pointer you just use it. Of course it is slower to destruct an object, because you have to clean up your weak pointers. There are various ways to implement this, but they pretty much all have pretty high memory use overhead. My Galaxy3 has a WeakPtr of this style, implemented using a doubly linked list to clean up the weaks.

The second way with a handle can be very compact in memory, but has the disadvantage of dirtying cache for any weak lookup because you have to go through an extra indirection. The implementation in cblib is based on what I did for Oddworld, and it's intended to be as small as possible in a fixed-memory console environment. On the console, the WeakPtr is 4 bytes per WeakPtr + 6 bytes per object, but you are limited to 64k objects max. The version in cblib is just a lazy port of that to 32bit indexes so it's got 8 byte WeakPtrs which is a waste. Note because of the GUID system in the cblib version, weak table entries are reused right away, so you only need as many weak table entries as you have objects, so I can statically allocated a 64k-element weak reference table.

Jordan's suggestion is basically like this :

WeakPtr
{
	int index; // index to weak table
};

WeakTableEntry
{
	RefCounted * ptr;
	short weakCount;
};
To use the WeakPtr you just look it up in the table with [index] and use the [ptr] there. When an object dies, you set its [ptr] to NULL and leave the entry in the weak table. The [weakCount] counts how many weak pointers point at that slot so you don't reuse the slot until the weak pointers go away.

WeakPtr can also update themselves for free; if you use the [index] to look up the table and you find a [ptr] that's NULL, then you just change your index to NULL and decrement [weakCount]. This is a lazy way of eventually releasing all the weak references to a slot.

In theory there are pathological things you could do to have a huge amount of unnecessary weak table entries, but that's very unlikely in practice. This method is 4 bytes for a weak pointer and 6 bytes per weak table entry - but you can't just statically allocate a 64k entry weak table for a max object count of 64k because there will be empties in your weak table.

Smart Pointers are falling a bit out of favor (not that they ever were IN favor) because of multi-threading. Multi-threading is ubiquitous in games these days so we have to handle it. Passing refcounted objects around between threads is totally fine - it just means you need to thread-protect all your ref access. This also means thread protecting the smart pointer constructor & destructor that make the ref count go up and down, as well as the weak pointer's accesses to the weak table. This can be a bit of a performance hit. In fact the weak table is especially ugly because it's memory that all your processors would have to share. The linked-list style weak pointer is actually better for multiprocessor memory access, assuming that object lookup is far more common than object deletion.

Now, this doesn't have to be a big performance problem, because I've always been a fan of the "ref holder" style of smart pointer use, rather than the full Herb Sutter exception-safe "everything takes smart pointers" style. The ref holder style looks like :

void MyFunc()
{
	SmartPtr sp = world.GetSomeObject();
	thingy * ptr = sp.Get();
	...
	... do stuff on "ptr" ...
	...
}
basically we're just holding sp in the function scope to hold a ref, but all our work is on a plain old naked pointer "ptr". The sp would cause one interlock to take a ref at the top and release one at the bottom, no biggy unless your function is tiny.

There are surely other threading issues that maybe people with more experience can chime in on.

For example, to be thread safe, there should not be any "Get()" function on the WeakPtr to return a naked pointer, you can only return a smart pointer and have an IsNull check, to avoid the cases where your WeakPtr returns a pointer but the object gets deleted by another thread before you use it.

No comments:

old rants