3/19/2013

03-19-13 - Windows Sleep Variation

Hmm, I've discovered that Sleep(n) behaves very differently on my three Windows boxes.

(Also remember there are a lot of other issues with Sleep(n) ; the times are only reliable here because this is in a no-op test app)

This actually started because I was looking into Linux thread sleep timing, so I wrote a little test to just Sleep(n) a bunch of times and measure the observed duration of the sleep.

(Of course on Windows I do timeBeginPeriod(1) and bump my thread to very high priority (and timeGetDevCaps says the minp is 1)).

Anyway, what I'm seeing is this :


Win7 :
sleep(1) : average = 0.999 , sdev = 0.035 ,min = 0.175 , max = 1.568
sleep(2) : average = 2.000 , sdev = 0.041 ,min = 1.344 , max = 2.660
sleep(3) : average = 3.000 , sdev = 0.040 ,min = 2.200 , max = 3.774

Sleep(n) averages n
duration in [n-1,n+1]

WinXP :
sleep(1) : average = 1.952 , sdev = 0.001 ,min = 1.902 , max = 1.966
sleep(2) : average = 2.929 , sdev = 0.004 ,min = 2.665 , max = 2.961
sleep(3) : average = 3.905 , sdev = 0.004 ,min = 3.640 , max = 3.927

Sleep(n) averages (n+1)
duration very close to (n+1) every time (tiny sdev)

Win8 :
sleep(1) : average = 2.002 , sdev = 0.111 ,min = 1.015 , max = 2.101
sleep(2) : average = 2.703 , sdev = 0.439 ,min = 2.017 , max = 3.085
sleep(3) : average = 3.630 , sdev = 0.452 ,min = 3.003 , max = 4.130

average no good
Sleep(n) minimum very precisely n
duration in [n,n+1] (+ a little error)
rather larger sdev

it's like completely different logic on each of my 3 machines. XP is the most precise, but it's sleeping for (n+1) millis instead of (n) ! Win8 has a very precise min of n, but the average and max is quite sloppy (sdev of almost half a milli, very high variation even with nothing happening on the system). Win7 hits the average really nicely but has a large range, and is the only one that will go well below the requested duration.

As noted before, I had a look at this because I'm running Linux in a VM and seeing very poor performance from my threading code under Linux-VM. So I ran this experiment :


Sleep(1) on Linux :

native : average = 1.094 , sdev = 0.015 , min = 1.054 , max = 1.224
in VM  : average = 3.270 , sdev =14.748 , min = 1.058 , max = 656.297

(added)
in VM2 : average = 1.308 , sdev = 2.757 , min = 1.052 , max = 154.025

obviously being inside a VM on Windows is not being very kind to Linux's threading system. On the native box, Linux's sleep time is way more reliable than Windows (small min-max range) (and this is just with default priority threads and SCHED_OTHER, not even using a high priority trick like with the Windows tests above).

added "in VM2". So the VM threading seems to be much better if you let it see many fewer cores than you have. I'm running on a 4 core (8 hypercore) machine; the base "in VM" numbers are with the VM set to see 4 cores. "in VM2" is with the VM set to 2 cores. Still a really bad max in there, but much better overall.

10 comments:

  1. Interesting!

    I wonder, could you try these out and compare the results:
    - http://en.cppreference.com/w/cpp/thread/sleep_for
    - http://en.cppreference.com/w/cpp/thread/sleep_until
    // Just out of curiosity.

    ReplyDelete
  2. Eh, no thanks.

    The C++11 api doc there says that the sleep must be >= the specified time. That means they shouldn't be calling Sleep(n) at all, which has always been unreliable; they should be using the win32 waitable timer mechanism, which does at least guarantee the >= inequality.

    BTW for the record : no (normal) app should ever Sleep(n) (or sleep_for or sleep_until). Never ever; it's broken code.

    (I use it in my thread stress test; the main Oodle code doesn't ever sleep(n))

    ReplyDelete
  3. Did you make sure to delay a random amount of time so the Sleep start time was decorrelated from any regular scheduling timer?

    ReplyDelete
  4. Remember that Sleep() on Windows is driven by a 1 KHz timer. Therefore the implementation has too choose between rounding up and rounding down when choosing which timer interrupt to wake up on. Windows XP rounds up, which makes for a minimum sleep of 2 ms. Windows 7 appears to round down.

    ReplyDelete
  5. Gee thanks Bruce, that really added a lot.

    ReplyDelete
  6. @nothings - I think the main issue is that the tests basically do sleep-sleep-sleep over and over, so the first one is going to get aligned with the timer interrupt, and then after that they fall roughly on alignment.

    I think WinXP's behavior is easiest to explain. What seems to be happening in the WinXP case is the test gets roughly aligned to the interrupt. The thread wakes up and then says Sleep(2) again. The OS puts us to sleep for *at least* 2 millis. Because there's some time passed since the wake up, that means we will sleep 2 time intervals plus almost one more full interval.

    Thus on WinXP the sleep times are all just under n+1 and the sdev is quite small because we that almost exactly each time.

    That's the behavior I would have guessed for all Windows. The surprise was the variation.

    And to be clear once again : this post is just a bit of tech trivia amusement, not a serious issue anyone should care about.

    ReplyDelete
  7. It makes me want to record an xperf trace to see where the variation is coming from. On an unloaded multi-core system I would expect really low variance on all systems. Windows 7 and Windows 8 both allow recording enough xperf/ETW data to figure out exactly what is going on.

    An xperf trace shows every context switch for every process on the system so you can see when your process is woken, early, late, whatever, and why.

    ReplyDelete
  8. That was an impressive deflection of my dickiness! Well done, sir.

    Yeah that would be interesting. I may try it...

    ReplyDelete
  9. I did the xperf test to see what was happening. Sleep() itself is (modulo the XP/later-versions change) ridiculously consistent.

    http://randomascii.wordpress.com/2013/04/02/sleep-variation-investigated/

    ReplyDelete
  10. Cool; interesting to know that it's actually rounding *down* not to nearest.

    I wish the MSDN docs were a bit better at identifying changes over time.

    ReplyDelete