(* = obviously not trivial if you're trying to minimize the memory ordering constraints, as evidenced by the revisions to this post that were required; it is trivial if you just make everything seq_cst)
Previous writings on this topic :
Smart & Weak Pointers - valuable tools for games - 03-27-04
cbloom rants 03-22-08 - 6
cbloom rants 07-05-10 - Counterpoint 2
cbloom rants 08-01-11 - A game threading model
cbloom rants 03-05-12 - Oodle Handle Table
The primary ops conceptually are :
Add object to table; gives it a WeakRef id
WeakRef -> OwningRef (might be null)
OwningRef -> naked pointer
OwningRef construct/destruct = ref count inc/dec
The full code is in here : cbliblf.zip , but you can get a
taste for how it works from the ref count maintenance code :
// IncRef looks up the weak reference; returns null if lost
// (this is the only way to resolve a weak reference)
Referable * IncRef( handle_type h )
{
    handle_type index = handle_get_index(h);
    LF_OS_ASSERT( index >= 0 && index < c_num_slots );
    Slot * s = &s_slots[index];

    handle_type guid = handle_get_guid(h);

    // this is just an atomic inc of state
    // but checking guid each time to check that we haven't lost this slot
    handle_type state = s->m_state.load(mo_acquire);
    for(;;)
    {
        if ( state_get_guid(state) != guid )
            return NULL;
        // assert refcount isn't hitting max
        LF_OS_ASSERT( state_get_refcount(state) < state_max_refcount );
        handle_type incstate = state+1;
        if ( s->m_state.compare_exchange_weak(state,incstate,mo_acq_rel,mo_acquire) )
        {
            // did the ref inc
            return s->m_ptr;
        }
        // state was reloaded, loop
    }
}
// IncRefRelaxed can be used when you know a ref is held
// so there's no chance of the object being gone
void IncRefRelaxed( handle_type h )
{
    handle_type index = handle_get_index(h);
    LF_OS_ASSERT( index >= 0 && index < c_num_slots );
    Slot * s = &s_slots[index];

    handle_type state_prev = s->m_state.fetch_add(1,mo_relaxed);
    (void) state_prev; // unused in release builds; the asserts below compile away

    // make sure we were used correctly :
    LF_OS_ASSERT( handle_get_guid(h) == state_get_guid(state_prev) );
    LF_OS_ASSERT( state_get_refcount(state_prev) >= 0 );
    LF_OS_ASSERT( state_get_refcount(state_prev) < state_max_refcount );
}
// DecRef
void DecRef( handle_type h )
{
    handle_type index = handle_get_index(h);
    LF_OS_ASSERT( index >= 0 && index < c_num_slots );
    Slot * s = &s_slots[index];

    // no need to check guid because I must own a ref
    handle_type state_prev = s->m_state.fetch_add((handle_type)-1,mo_release);
    LF_OS_ASSERT( handle_get_guid(h) == state_get_guid(state_prev) );
    LF_OS_ASSERT( state_get_refcount(state_prev) >= 1 );
    if ( state_get_refcount(state_prev) == 1 )
    {
        // I took refcount to 0
        // slot is not actually freed yet; someone else could IncRef right now
        // the slot becomes inaccessible to weak refs when I inc guid :
        // try to inc guid with refcount at 0 :
        handle_type old_guid  = handle_get_guid(h);
        handle_type old_state = make_state(old_guid,0);   // == state_prev-1
        handle_type new_state = make_state(old_guid+1,0); // == old_state + (1<<handle_guid_shift)
        if ( s->m_state.compare_exchange_strong(old_state,new_state,mo_acq_rel,mo_relaxed) )
        {
            // I released the slot
            // cmpx provides the acquire barrier for the free :
            FreeSlot(s);
            return;
        }
        // somebody else mucked with me
    }
}
The maintenance of ref counts only requires a relaxed atomic increment and a release atomic decrement (except when the pointed-at object is first created and finally destroyed, which requires some more work). Even just the relaxed atomic incs
could get expensive if you did a ton of them, but my philosophy for how to use this kind of system is that you inc & dec refs
as rarely as possible. The key thing is that you don't write functions that take owning refs as arguments, like :
void bad_function( OwningRefT<Thingy> sptr )
{
    more_bad_funcs(sptr);
}

void Stuff::bad_caller()
{
    OwningRefT<Thingy> sptr( m_weakRef );
    if ( sptr != NULL )
    {
        bad_function(sptr);
    }
}
hence doing lots of inc & decs on refs all over the code. Instead you write all your
code with naked pointers, and only use the smart pointers where they are needed to ensure ownership for the lifetime
of usage. eg. :
void good_function( Thingy * ptr )
{
    more_good_funcs(ptr);
}

void Stuff::good_caller()
{
    OwningRefT<Thingy> sptr( m_weakRef );
    Thingy * ptr = sptr.GetPtr();
    if ( ptr != NULL )
    {
        good_function(ptr);
    }
}
If you like formal rules, they're something like this :
1. All stored variables are either OwningRef or WeakRef , depending on whether it's
an "I own this" or "I see this" relationship. Never store a naked pointer.
2. All variables in function call args are naked pointers, as are variables on the
stack and temp work variables, when possible.
3. WeakRef to pointer resolution is only provided as WeakRef -> OwningRef. Naked pointers
are only retrieved from OwningRefs.
And obviously there are lots of enhancements to the system that are possible. A major one that I recommend is to put more information in the reference table state word. If you use a 32-bit weak reference handle and a 64-bit state word, then you have 32 bits of extra space that you can check for free during weak reference resolution. You could put some mutex bits in there (or an rwlock) so that the state contains the lock for the object, but I'm not sure that is a big win (the only advantage of having the lock built into the state is that you could atomically take the lock and inc the refcount in a single op). A better usage is to put some object information in there that can be retrieved without chasing the pointer and inc'ing the ref and so on.
For example in Oodle I store the status of the object in the state table. (Oodle status is a progression through Invalid->Pending->Done/Error). That way I can take a weak ref and query status in one atomic load. I also store some lock bits, and you aren't allowed to get back naked pointers unless you have a lock on them.
The code for the weak ref table is now in the cbliblf.zip that I made for the last post. Download : cbliblf.zip
( The old cblib has a non-LF weak reference table that's similar for comparison. It's also more developed with helpers and fancier templates and such that could be ported to this version. Download : cblib.zip )
ADDENDUM : alternative DecRef that uses CAS instead of atomic decrement. Removes the two-atomic free path.
Platforms that implement atomic add as a CAS loop should probably just use this form. Platforms that have
true atomic add should use the previously posted version.
// DecRef
void DecRef( handle_type h )
{
    handle_type index = handle_get_index(h);
    LF_OS_ASSERT( index >= 0 && index < c_num_slots );
    Slot * s = &s_slots[index];

    // no need to check guid because I must own a ref
    handle_type state_prev = s->m_state($).load(mo_relaxed);
    handle_type old_guid = handle_get_guid(h);
    for(;;)
    {
        // I haven't done my dec yet, guid must still match :
        LF_OS_ASSERT( state_get_guid(state_prev) == old_guid );
        // check current refcount :
        handle_type state_prev_rc = state_get_refcount(state_prev);
        LF_OS_ASSERT( state_prev_rc >= 1 );
        if ( state_prev_rc == 1 )
        {
            // I'm taking refcount to 0
            // also inc guid, which releases the slot :
            handle_type new_state = make_state(old_guid+1,0);
            if ( s->m_state($).compare_exchange_weak(state_prev,new_state,mo_acq_rel,mo_relaxed) )
            {
                // I released the slot
                // cmpx provides the acquire barrier for the free :
                FreeSlot(s);
                return;
            }
        }
        else
        {
            // this is just a decrement
            // but have to do it as a CAS to ensure state_prev_rc doesn't change on us
            handle_type new_state = state_prev-1;
            LF_OS_ASSERT( new_state == make_state(old_guid, state_prev_rc-1) );
            if ( s->m_state($).compare_exchange_weak(state_prev,new_state,mo_release,mo_relaxed) )
            {
                // I dec'ed a ref
                return;
            }
        }
    }
}