9/21/2010

09-21-10 - Waiting on Thread Events Part 2

So we've seen the problem, let's look at some solutions.

First of all let me try to convince you that various hacky solutions won't work. Say you're using Semaphores. Instead of doing Up() to signal the event, you could do Increment(1000); That will cause the Down() to just run through immediately until it's pulled back down, so the next bunch of tests will succeed. This might in fact make the race never happen in real life, but the race is still there.

1. Put a Mutex around the Wait. The mutex just ensures that only one thread at a time is actually in the wait on the event. In particular we do :


WaitOnMessage_4( GUID )
{

    {
        MutexInScope(ReceiveMutex);
        pop everything on the receive FIFO
        mark everything I received as done
        if ( the GUID I wanted is done )
            return;
    }

    for(;;)
    {
        MutexInScope( WaiterMutex );

        {
            MutexInScope(ReceiveMutex);
            pop everything on the receive FIFO
            mark everything I received as done
            if ( the GUID I wanted is done )
                return;
        }

        Wait( ReceiveEvent );
    }
}

note that the WaiterMutex is never taken by the Worker thread, only by "main" threads. Also note that it must be held around the whole check for the GUID being done or you have another race. Also we have the separate ReceiveMutex for the pop/mark phase so that when we are in the wait we don't block other threads from checking status.

Now when multiple threads try to wait, the first will get in the mutex and actually wait on the Event, the second will try to lock the mutex and stall out there and wait that way.

The problem with this is that threads that can proceed don't always get to. It's not broken, you will make progress, but it's nowhere near optimal. Consider if N threads come in and wait on different GUIDs. Now the worker finishes something, so that wakes up a Wait and he goes around the loop which releases the Mutex - if you just so happened to wake up the one guy who could proceed, then it's great, he returns, but if you woke up the wrong guy, he will take the mutex and go back to sleep. (also note when the mutex unlocks it might immediately switch thread execution to one of the people waiting on locking it because of the Window fair scheduling heuristics). So you only have a 1/N chance of making optimal progress, and in the worst case you might make all N threads wait until all N items are done. That's bad. So let's look at other solutions :

2. Put the Event/Semaphore on each message instead of associated with the whole Receive queue. You wait on message->event , and the worker signals each message's event when it's done.

This also works fine. It is only race-free if GUIDs can only be used from one thread, if multiple threads can wait on the same GUID you are back to the same old problem. This also requires allocating a Semaphore per message, which may or may not be a big deal depending on how fine grained your work is.

On the plus side, this lets the Wait() be on the exact item you want being done, not just anything being done, so your thread doesn't wake up unnecessarily to check if its item was done.


WaitOnMessage_5A( GUID )
{
    Event ev;

    {
        MutexInScope(ReceiveMutex);
        pop everything on the receive FIFO
        mark everything I received as done
        if ( the GUID I wanted is done )
            return;

        ev = message(GUID)->event
    }

    Wait( ev );
}

we're glossing over the tricky part of this one which is the maintenance of the event on each message. Let's say we know that GUIDs are only touched by one thread at a time. Then we can do the event destruction/recycling when we receive a done message. That way we know that event is always safe for us to wait on outside of the mutex because the only person who will ever Clear or delete that event is us.

I think you can make this safe for multiple threads thusly :


WaitOnMessage_5B( GUID )
{
    Semaphore ev;

    {
        MutexInScope(ReceiveMutex);
        pop everything on the receive FIFO
        mark everything I received as done
        if ( the GUID I wanted is done )
            return;

        message(GUID)->users ++;
        ev = message(GUID)->semaphore;
    }

    Down( ev );

    {
        MutexInScope(ReceiveMutex);
        pop everything on the receive FIFO
        mark everything I received as done
        ASSERT( the GUID I wanted is done );

        message(GUID)->users --;
        if ( message(GUID)->users == 0 )
            delete message(GUID)->semaphore;
    }
}

and you make worker signal the event by doing Up(1000); you don't have to worry about that causing weird side effects, since each event is only ever signalled once, and the maximum number of possible waits on each semaphore is the number of threads. Since the semaphore accesses are protected by mutex, I think you could also do lazy semaphore creation, eg. don't make them on messages by default, just make them when somebody tries to wait on that message.

3. Make the Event per thread. You either put the ReceiveEvent in the TLS, or you just make it a local on the stack or somewhere for each thread that wants to wait. Then we just use WaitOnMessage_3 , but the ReceiveEvent is now for the thread. The tricky bit is we need to let the worker thread know what events it needs to signal.

The easiest/hackiest way is just to have the worker thread always signal N events that you determine at startup. This way will in fact work fine for many games. A slightly cleaner way is to have a list of the events that need signalling. But now you need to protect access to that with a mutex, something like :


WaitOnMessage_6( GUID , LocalEvent )
{
    {
    MutexInScope(EventListMutex)
        add LocalEvent to EventList
    }

    for(;;)
    {
        {
        MutexInScope(ReceiveMutex);
        pop everything on the receive FIFO
        mark everything I received as done
        if ( the GUID I wanted is done )
            return;
        }

        Wait( LocalEvent );
    }

    {
    MutexInScope(EventListMutex)
        remove LocalEvent from EventList
    }

}

and worker does :

    do work
    push message to receive FIFO
    
    {
    MutexInScope(EventListMutex)
        signal everything in EventList
    }

one nice thing about this is that you can push GUID with LocalEvent and then only signal events that want to hear about that specific message.

Note with the code written as-is it works but has very nasty thread thrashing. When the worker signals the events in the list, it immediately wakes up the waiting threads, which then try to lock the EventListMutex and immediately go back to sleep, which wakes the worker back up again. That's pretty shitty so a slightly better version is if the worker does :


    do work
    push message to receive FIFO
    
    TempList;
    {
    MutexInScope(EventListMutex)
        copy everything in EventList to TempList
    }
    -*-
    signal everything in TempList

this fixes our thread thrashing, but it now gives us the usual lock-free type of restrictions - events in TempList are no longer protected by the Mutex, so that memory can't be reclaimed until we are sure that the worker is no longer touching them. (in practice the easiest way to do this is just to use recycling pooled allocators which don't empty their pool until application exit). Note that if an event is recycled and gets a different id, this might signal it, but that's not a correctness error because extra signals are benign. (extra signals just cause a wasteful spin around the loop to check if your GUID is done, no big deal, not enough signals means infinite wait, which is a big deal).

NOTE : you might think there's a race at -*- ; the apparent problem is that the worker has got the event list in TempList, and it gets swapped out, then on some main thread I add myself to EventList, run through and go to sleep in the Wait. Then worker wakes up at -*- and signals TempList - and I'm not in the list to signal! Oh crap! But this can't happen because if that was the work item I needed, it was already put in the receive FIFO and I should have seen it and returned before going into the Wait.

We can also of course get rid of the Mutex on EventList completely by doing the usual lockfree gymnastics; instead make it a message passing thing :


WaitOnMessage_6B( GUID , LocalEvent )
{
    Send Message {Add,LocalEvent,GUID}

    for(;;)
    {
        {
        MutexInScope(ReceiveMutex);
        pop everything on the receive FIFO
        mark everything I received as done
        if ( the GUID I wanted is done )
            break;
        }

        Wait( LocalEvent );
    }

    Send Message {Remove,LocalEvent,GUID}
}

and worker does :

    pop message from "send FIFO" to get work
    do work
    push message to "receive FIFO"

    pop all event messages
        process Add/Remove commands 

    signal everything in EventList

but this isn't tested and I have very little confidence in it. I would want to Relacey this before using it in real code.

4. Just don't do it. Sometimes domain-specific solutions are best. In particular there's a very simple solution I can use for the current problem I'm having.

The thing that made me hit this issue is that I made my work-stealing Worklet system able to wait on IO's. In this case the "worker" is the IO service thread and the messages are IO requests. So now the main thread can wait on IO and so can the Worklets. It's important that they really go to a proper sleep if they are blocked on IO.

But I can solve it in a simple way in that case. The Worklets already have a way to go sleep when they have no work to do. So all I do is if the Worklets only have work which depends on an uncompleted IO, they push their Work back onto the main work dispatch list and go to sleep on the "I have no work" event. Then, whenever the main thread receives any IO wakeup event, it immediately goes and checks the Worklet dispatch list and sees if anything was waiting on an IO completion. If it was, then it hands out the work and wakes up some Worklets to do it.

This solution is more direct and also has the nice property of not waking up unnecessary threads and so on.

1 comment:

cbloom said...

BTW I think this could be done much more neatly using condition variables.

old rants