2/23/2013

02-23-13 - Threading Patterns - Wake Polling

Something I've written about a lot but never given a solid name to.

When a thread is waiting on some condition, your goal should be to only wake it up if that condition is actually true - that is, the thread really gets to run. In reverse order of badness :

1. Wakeup condition polling. This is the worst and is very common. You're essentially just using the thread wakeup to say "hey your condition *might* be set, check it yourself". The suspect code looks something like :


while( ! condition )
{
    Wait(event);
}

these threads can waste a ton of cycles just waking up, checking their condition, then going back to sleep.

One of the common ways to get nasty wake-polling is when you are trying to just wake one thread, but you have to do a broadcast due to the possibility of a missed wakeup (as in the naive semaphore from waitset ).

Of course any usage of cond_var is a wake-poll loop. I really don't like cond_var as an API or a programming pattern. It encourages you to write wakee-side condition checks. Whenever possible, waker-side condition checks are better. (See previous notes on cond vars such as : In general, you should prefer to use the CV to mean "this condition is set" , not "hey wakeup and check your condition").

(ADDENDUM : in fact I dislike cond_var so much I wrote a proposal on an alternative cond_var api ).

Now it's worth breaking this into two sub-categories :

1.A. Wake-polling when it is extremely likely that you get to run immediately.

This is super standard and is not that bad. At root, what's happening here is that under normal conditions, the wakeup means the condition is true and you get to run. The loop is only needed to catch the race where someone stole your wakeup.

For example, the way Linux implements semaphore on futex is a classic wake-poll. The core loop is :


    for(;;)
    {
        if ( try_wait() ) break;

        futex_wait( & sem->value, 0 ); // wait if sem value == 0
    }

If there's no contention, you wake from the wait and get to try_wait (dec the count) and proceed. The only time you have to loop is if someone else raced in and dec'ed the count before you. (see also in that same post a discussion of why you actually *want* that race to happen, for performance reasons).

The reason this is okay is because the futex semaphore only has to do a wake 1 when it signals. If it had to do a broadcast, this would be a bad loop. (and note that the reason it can do a broadcast is due to the special nature of the futex wait, which ensures that the single thread signal actually goes to someone who needs it!) (see : cbloom rants 08-01-11 - Double checked wait ).

1.B. Wake-polling when it is unlikely that you get to run.

This is the really bad one.

As I've noted previously ( cbloom rants 07-26-11 - Implementing Event WFMO ) this is a common way for people to implement WFMO. The crap implementation basically looks like this :


while ( any events in array[] not set )
{
    wait on an unset event in array[]
}

What this does is any time one of the events in the set triggers, it wakes up all the waiters that are waiting on it in an array, checks the array, and they go back to sleep.

Obviously this is terrible, it causes bad "thread thrashing" - tons of wakeups and immediate sleeps just to get one thread to eventually run.

2. "Direct Handoff" - minimal wakes. This is the ideal; you only wake a thread when you absolutely know it gets to run.

When only a single thread is waiting on the condition, this is pretty easy, because there's no issue of "stolen wakeups". With multiple threads waiting, this can be tricky.

The only way to really robustly ensure that you have direct handoff is by making the wakeup ensure the condition.

At the low level, you want threading primitives that don't give you unnecessary wakeups. eg. we don't like the pthreads cond_var that has you call :

    condvar.wait();
    mutex.lock();
as two separate calls, which means you can wake from the condvar and immediately fail to get the mutex and go back to sleep. Prefer a single call :
    condvar.wait_then_lock(mutex);
which only wakes you when you get a cv signal *and* can acquire the mutex.

At the high level, the main thing you should be doing is *waker* side checks.

eg. to do a good WFMO you should be checking for all-events-set on the *waker* side. To do this you must create a proxy event for the set when you enter the wait, register all the events on the proxy, and then you only signal the proxy when they are all set. When one of them is set, it does the checking. That is, the checking is moved to the signaller. The advantage is that the signalling thread is already running.

5 comments:

Stephan said...

What do you mean exactly with the pthreads cond_var making you call wait and lock as two separate calls? pthread_cond_wait obviously does the unlock-wait-lock in one call, but you probably mean the underlying implementation...

Regarding the minimizing of unnecessary wakes: Since a condition variable is associated with a mutex, you can get relatively far with a user-side logic like the one below:

int waitCount = 0;
int notifyCount = 0;

// Waiter
// ======
mutex.lock();
while (!condition) {
++waitCount;
condvar.wait(mutex); // unlocks, waits and locks again
if (notifyCount > 0) { // the waker decremented the waitCount
--notifyCount;
} else { // adjust waitCount for spurious wake up
--waitCount;
}
} // while
// ...
mutex.unlock();

// Waker
// ======
mutex.lock();
condition = true;
if (waitCount == 0) mutex.unlock();
else {
--waitCount;
++notifyCount;
mutex.unlock();
condvar.wake();
}

If you implement the mutex and condition variable with futexes, you could implement
a wake that requeues the condvar waiter directly to the mutex and then reverse the mutex.unlock(); condvar.wake() sequence above, which would further minimize condvar waiters waking up only to immediately wait on the mutex again.

cbloom said...

Stephan, yeah I'm being a little fast-and-loose.

The big problem is that the condvar can't assume it's protected by "mutex". If a mutex was bound to condvar, you could more easily have better implementations.

The

mutex.unlock();
condvar.wake();

is a problem; in many cases you'd prefer to wake before unlock, because that ensures that the condition you set is still set at the time you wake(). You'd also like the "wait->lock" sequence to be atomic. If you had that you could ensure that a thread that wakes from wait will see what the waker set.

Really you want unlock_wake() to be atomic. (doing unlock after wake can be a bad performance bug on some implementations).

Your code sketch looks like it's just to prevent calls to wake() when there's no waiter. Really a good implementation should do that pretty well already internally.

The spurious wake case that I'm thinking about occurs when you have multiple waiters on the same cond_var, possibly with slightly different conditions. Something like :

thread 1: (waiter)

cv.lock();
while ( count == 0 && sel != 0 )
{
cv.unlock_wait_lock();
}
count--;
// do stuff
cv.unlock();

thread 2: (waiter)
cv.lock();
while ( count == 0 && sel != 1 )
{
cv.unlock_wait_lock();
}
count--;
// do stuff
cv.unlock();

thread 3: (waker)
cv.lock();
// publish item
count++;
sel = 0 or 1;
cv.broadcast_unlock();


because of the CV API you have to wake all threads and they then check their condition to decide if the signal can go to them.

Obviously through good usage you can avoid this, but the point is that it should be easy to write code well. The way to do that is to make the wake condition part of the wait API.

cbloom said...

Of course you can do waker-side checking with the existing APIs (that's what I do). The way you do it is to use a unique event/cv for each thread's wait condition, and have the waker decide which ones of those to signal. But you shouldn't have to build all that structure yourself, cv is so close to being there already.

Stephan said...

Thanks for your reply!

For my internal Futex-based condition variable implementation, I actually restrict the condvar to a single mutex in the constructor, but you were probably thinking more about a fused mutex-condvar primitive at the OS-level.

Regarding the wake-unlock order: Even if the condition is guaranteed to be true when you call wake, that doesn't mean that the condition must be still true when a thread is actually woken up and has succeeded to take the lock (at least if you have multiple waiters). So I'm not sure how much that is an argument for wake before unlock. Note that my implementation with a separate wait and notify count won't call wake too often if a waker manages to obtain the lock again after a wake-unlock but before a previously woken-up thread manages to acquire the lock (which could happen if the waiter does both the increment and decrement of the wait count).

To guarantee that the condition is still true when a woken-up waiter obtained the lock, you'd have to prevent all spurious wakeups and the wake_unlock operation would have to atomically unlock and wake *and* give the lock to the woken up thread, i.e. atomically transfer the lock from the waker to the waiter, right?

cbloom said...

Yeah. I did a demo a while ago of an implementation does does direct handoff of the cond_var from signaller to receiver :

http://cbloomrants.blogspot.com/2011/07/07-20-11-condvar-that-actually-atomic.html

That is, the mutex & condvar are built together, and the wake-from-wait also takes the lock.

I'm not suggesting that the API should promise that, it's too hard to do in practice.

What I'm trying to say is the closer you can get to that, the less you have of "I woke up, my conditions not set, I go back to sleep". (or inside the pthreads condvar, I woke up, I can't lock the mutex, go back to sleep).

The goal is to maximize the % of times that a condition waiter just goes to sleep once. Then it wakes only once and finds its condition and proceeds.

old rants