7/29/2011

07-29-11 - Semaphore Work Counting issues

Say you are using something like "fastsemaphore" to count work items that you have issued to some worker threads. Here are some issues you may want to consider:

1. "Thread thrashing". Say you are issuing several work items, with something like:

queue.push();
sem.post();

queue.push();
sem.post();

queue.push();
sem.post();

This can cause bad "thread thrashing": a worker thread wakes up, does a work item, sees the queue is empty, and goes back to sleep; then the issuer wakes up, pushes the next item, wakes the worker again, and so on. This can happen, for example, if the work items are very tiny and your work issuer is not getting them out fast enough. It can also happen simply because the worker has >= the priority of the issuer. This is especially likely on Windows, where the worker may have a temporary priority boost (for example because it hasn't run in a while); then your sem.post() might immediately swap out your thread and swap in the worker, which is obviously very bad.

The solution is just to batch up the posts, like :

queue.push();
queue.push();
queue.push();

sem.post(3);

Note that just calling post three times in a row doesn't necessarily do the trick; you need the single post of 3.

2. If you have several workers and some are awake and some are asleep, you may wish to spin a bit before waking the sleeping workers to see if the awake ones took the work already. If you don't do this, you can get another type of thrashing: you post a work item and wake a worker, a previously running worker finishes his job and grabs the new item, and now the newly awake worker sees nothing to do and goes back to sleep.

You can handle this by spinning briefly between the sem increment and the actual thread waking to see if someone grabs the item in that interval. Note that this doesn't actually fix the problem, of course, because this is an inherent race. Because the thread wake takes a long time, it is still quite possible that the work queue is empty by the time the new worker wakes up. (To do better you would need more information about how long the work item takes, what other work there is to do, etc.)
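
One way the "spin between increment and wake" idea could look, as a rough sketch. SpinPostSemaphore is a hypothetical class built on a mutex and condition variable purely for illustration; it is not how fastsemaphore is implemented, and the yield-based spin and the spin_count parameter are assumptions.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical semaphore whose post() spins briefly before waking a sleeper.
class SpinPostSemaphore {
    std::atomic<int> count_{0};     // items available to take
    std::atomic<int> sleepers_{0};  // threads blocked (or about to block) in wait()
    std::mutex m_;
    std::condition_variable cv_;
    int spin_count_;

public:
    explicit SpinPostSemaphore(int spin_count) : spin_count_(spin_count) {}

    // Take one count if available; never blocks.
    bool try_wait() {
        int c = count_.load();
        while (c > 0)
            if (count_.compare_exchange_weak(c, c - 1)) return true;
        return false;
    }

    void wait() {
        if (try_wait()) return;
        std::unique_lock<std::mutex> lock(m_);
        ++sleepers_;                                    // register before re-checking
        cv_.wait(lock, [this] { return try_wait(); });  // re-checks count before sleeping
        --sleepers_;
    }

    // Returns true if a wake was issued, false if it was skipped.
    bool post() {
        count_.fetch_add(1);
        if (sleepers_.load() == 0) return false;   // nobody is asleep
        for (int i = 0; i < spin_count_; ++i) {
            if (count_.load() == 0) return false;  // an awake worker already took it
            std::this_thread::yield();
        }
        std::lock_guard<std::mutex> lock(m_);      // lock so the wake cannot be lost
        cv_.notify_one();
        return true;
    }
};
```

Skipping the wake is safe here only because it happens when the count has been observed at zero, meaning every item posted so far has been taken. The woken worker can still find nothing to do, exactly as described above; the spin only narrows the window.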

3. A related case is when a worker sees no work to do and is thinking about going to sleep; he can spin there between seeing the queue empty and actually sleeping to see if some work becomes available during that interval.

I should note that this kind of spinning for optimization is not an unambiguous win, and it's very hard to really measure.

In benchmark/profiling scenarios it can seem to increase your performance a lot. But that's a bit unrealistic; in benchmark scenarios you would do best by giving all your threads infinite priority and locking them to cores, and never letting them go to sleep.

Basically the spinning in these cases takes away a bit of time from other threads. Depending on what other threads you have to run, you can actually hurt performance.
