To start, see Jon Watte's old summary, which is still good.
Basically you have timeGetTime(), QPC, or TSC.
TSC is fast (~100 clocks) and high precision. The problems I know of with TSC :
TSC either tracks CPU clocks, or time passing. On older CPUs it actually increments with each cpu cycle, but on newer CPUs it just tracks time (!). The newer "constant rate" TSC on Intel chips runs at some frequency which so far as I can tell you can't query.
If TSC tracks CPU cycles, it will slow down when the CPU speedsteps. If the CPU goes into a full sleep state, the TSC may stop running entirely. These issues are bad on single core, but they're even worse on multi-proc systems where the cores can independently sleep or speedstep. See for example these linux notes or tsc.txt .
Unfortunately, if TSC is constant rate and tracking real time, then it no longer tracks cpu cycles, which is actually what you want for measuring performance (you should always report speeds of micro things in # of clocks, not in time).
Furthermore on some multicore systems, the TSC gets out of sync between cores (even without speedsteps or power downs). If you're trying to use it as a global time, that will hose you. On some systems, it is kept in sync by the hardware, and on some you can get a software patch that makes rdtsc do a kernel interrupt kind of thing which forces the TSC's of the cores to sync.
See this email I wrote about this issue :
Apparently AMD is trying to keep it hush hush that they fucked up and had to release a hotfix. I can't find any admission of it on their web site any more ;
this is the direct download of their old utility that forces the cores to TSC sync : TscSync
they now secretly put this in the "Dual Core Optimizer" : Dual Core Optimizer Oh, really AMD? it's not a bug fix, it's an "optimizer". Okay.
There's also a separate issue with AMD C&Q (Cool & Quiet) if you have multiple cores/processors that decide to clock up & down. I believe the main fix for that now is just that they are forbidden from selecting different clocks. There's an MS hotfix related to that : MS hotfix 896256
I also believe that the newest version of the "AMD Processor Driver" has the same fixes related to C&Q on multi-core systems : AMD Driver I'm not sure if you need both the AMD "optimizer" and processor driver, or if one is a subset of the other.
Okay, okay, so you decide TSC is too much trouble, you're just going to use QPC, which is what MS tells you to do anyway. You're fine, right?
Nope. First of all, on many systems QPC actually is TSC. Apparently Windows evaluates your system at boot and decides how to implement QPC, and sometimes it picks TSC. If it does that, then QPC is fucked in all the ways that TSC is fucked.
So to fix that you can apply this : MS hotfix 895980 . Basically this just puts /USEPMTIMER in boot.ini which forces QPC to use the PCI clock instead of TSC.
But that's not all. Some old systems had a bug in the PCI clock that would cause it to jump by a big amount once in a while.
Because of that, it's best to advance the clock by taking the delta from previous and clamping that delta to be in valid range. Something like this :
U64 GetAbsoluteQPC()
{
    // HUGE_NUMBER : sanity bound on a single step ; anything bigger is a glitch
    static U64 s_lastQPC = GetQPC();
    static U64 s_lastAbsolute = 0;
    U64 curQPC = GetQPC();
    U64 delta = curQPC - s_lastQPC; // unsigned : a backward jump shows up huge
    s_lastQPC = curQPC;
    if ( delta < HUGE_NUMBER )
        s_lastAbsolute += delta;    // else : drop the glitched delta
    return s_lastAbsolute;
}
(note that "delta" is unsigned, so when QPC jumps backwards, it will show up as a very large positive delta, which is why we compare vs HUGE_NUMBER ; if you're using QPC just to get frame times in a game, then a reasonable thing is to just get the raw delta from the last frame, and if it's way out of reasonable bounds, just force it to be 1/60 or something).
Urg.
BTW while I'm at it, I think I'll evangelize a "best practice" I have recently adopted. Both QPC and TSC have problems with wrapping. They're unsigned integers, and as your game runs you can hit the end and wrap around. Now, 64 bits is a lot. Even if your TSC frequency is 1000 GigaHz (1 THz), you won't overflow 64 bits for 194 days. The problem is they don't start at 0.
Unsigned int wrapping works perfectly when you do subtracts and keep them in unsigned ints. That is :
in 8 bits : U8 start = 250; U8 end = 3; U8 delta = end - start; // delta == 9
That's cool, but lots of other things don't work with wrapping :
U64 tsc1 = rdtsc(); ... some stuff ... U64 tsc2 = rdtsc(); U64 avg = ( tsc1 + tsc2 ) /2;
This is broken because tsc may have wrapped.
The one that usually gets me is simple compares :
if ( time1 < time2 )
{
// ... event1 was earlier
}
are broken when time can wrap. In fact, with unsigned times that wrap there is no way to tell which one came first (though you can if you put a limit on the maximum time delta that you consider valid - e.g. any place that you compare times, you assume they are within 100 days of each other).
But this is easily fixed. Instead of letting people call rdtsc raw, you bias it :
uint64 Timer::GetAbsoluteTSC()
{
    // bias by the first value ever seen, so times start at 0
    static uint64 s_first = rdtsc();
    uint64 cur = rdtsc();
    return (cur - s_first);
}
this gives you a TSC that starts at 0 and won't wrap for a few years. This lets you just do normal compares everywhere to know what came
before what. (I used the TSC as an example here, but you mainly want QPC to be the time you're passing around).