1/30/2009

01-30-09 - Stack Tracing on Windows

There are basically 3 ways to capture stack traces on Windows.

1. Manually walking ebp/esp ; this steps back through the chain of frame pointers, so it relies on each function pushing its caller's ebp on entry. This is basically what RtlCaptureStackBackTrace or DmCaptureStackBackTrace does, but you can also just write it yourself very easily. The advantage of this is it's reasonably fast. The disadvantage is it doesn't work on all CPU architectures, and it doesn't work with the frame pointer omission optimization.

For info on RtlCaptureStackBackTrace, see Undocumented NT Internals or MSDN
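
To make the ebp walk concrete, here is a minimal sketch of the loop. It is not the code inside RtlCaptureStackBackTrace; it just assumes the usual x86 layout where [ebp] holds the caller's saved ebp and [ebp+4] holds the return address. The Frame struct and walk_frames name are made up for illustration, and the walk runs over any chain of such records so it can be tried without touching a real stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed x86 frame layout when frame pointers are kept:
   [ebp]   = caller's saved ebp
   [ebp+4] = return address into the caller
   (Frame and walk_frames are hypothetical names for this sketch.) */
typedef struct Frame {
    const struct Frame *caller;   /* saved ebp of the calling function */
    uintptr_t return_address;     /* where execution resumes in the caller */
} Frame;

/* Follows the saved-ebp chain starting at fp and stores up to max_frames
   return addresses in out, innermost first. Returns the number stored. */
size_t walk_frames(const Frame *fp, uintptr_t *out, size_t max_frames)
{
    size_t n = 0;
    assert(out != NULL || max_frames == 0);

    while (fp != NULL && n < max_frames) {
        out[n++] = fp->return_address;

        /* The stack grows down, so a caller's frame must live at a higher
           address. Anything else means the chain is corrupt (or ebp was
           reused as a general register under FPO), so stop here. */
        if (fp->caller != NULL && (uintptr_t)fp->caller <= (uintptr_t)fp)
            break;

        fp = fp->caller;
    }
    return n;
}
```

On a real 32-bit build the starting pointer would come from the current ebp (for example via a "mov fp, ebp" in inline asm), and you would also want to check that each pointer stays inside the thread's stack before dereferencing it.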

2. StackWalk64. This is the new API you're supposed to use. The advantage is it works on all CPUs and it even works with frame pointer omission (!). But you can see from that latter fact that it must be very slow. In order to work with FPO it loads the PDB and uses the instruction pointer map to figure out how to trace back. It also can trace through lots of system calls that normal ebp-walking fails on.

See gamedev.net or ExtendedTrace for examples. But it's really too slow.

3. Manual push/pop in prolog/epilog. Uses the C compiler to stick a custom enter/leave on every function that does a push & pop to your own stack tracker. Google Perftools has an option to work this way. The "MemTracer" project works this way (more on MemTracer some day). The nice thing about this is it works on any architecture as long as the prolog/epilog is supported. The disadvantage is it adds a big overhead even on functions that you never trace. That rather sucks. Stack traces are very rare in my world, so I want to pay the cost of them only when I actually do them; I don't want to be pushing & popping stack info all the time.
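
A rough sketch of what option 3 boils down to, with the enter/leave calls written by hand. In practice the compiler inserts them for you (for example MSVC's /Gh and /GH hooks or GCC's -finstrument-functions). All the names here (shadow_enter, shadow_leave, shadow_capture, outer, inner) are invented for the example, and it is single-threaded; a real tracker would need one stack per thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SHADOW_MAX_DEPTH 256

/* Our own side stack of function names, maintained on every call. */
const char *g_shadow_stack[SHADOW_MAX_DEPTH];
size_t g_shadow_depth = 0;

/* Prolog hook: record that we entered a function. */
void shadow_enter(const char *name)
{
    if (g_shadow_depth < SHADOW_MAX_DEPTH)
        g_shadow_stack[g_shadow_depth] = name;
    g_shadow_depth++;  /* keep counting past the cap so leave stays balanced */
}

/* Epilog hook: pop the entry pushed by the matching shadow_enter. */
void shadow_leave(void)
{
    assert(g_shadow_depth > 0);
    g_shadow_depth--;
}

/* Taking a trace is just a copy of the side stack, innermost first. */
size_t shadow_capture(const char **out, size_t max_frames)
{
    size_t depth = g_shadow_depth < SHADOW_MAX_DEPTH ? g_shadow_depth
                                                     : SHADOW_MAX_DEPTH;
    size_t n = 0;
    while (n < max_frames && n < depth) {
        out[n] = g_shadow_stack[depth - 1 - n];
        n++;
    }
    return n;
}

/* Two instrumented functions; inner() grabs a trace while it runs. */
const char *g_captured[8];
size_t g_captured_count = 0;

void inner(void)
{
    shadow_enter("inner");
    g_captured_count = shadow_capture(g_captured, 8);
    shadow_leave();
}

void outer(void)
{
    shadow_enter("outer");
    inner();
    shadow_leave();
}
```

Calling outer() leaves "inner" then "outer" in g_captured. The capture itself is nearly free, but every call to outer() and inner() pays for the push and pop whether or not anyone ever asks for a trace, which is exactly the overhead complained about above.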

3 comments:

castano said...

I've been playing with that stuff today and I was not able to get StackWalk64 to work without frame pointers. I would get garbage, while RtlCaptureStackBackTrace simply returned 0.

Something that puzzled me for a while was that the compiler was removing the frame pointers even when the FPO optimization was disabled. I had to explicitly set the /Oy- option to prevent that from happening.

On the other hand, both StackWalk64 and RtlCaptureStackBackTrace worked fine on Win64 without frame pointers. Win64 embeds additional metadata in the binaries to make that possible: http://www.nynaeve.net/?p=101

cbloom said...

Hmm.. not sure what the problem was with StackWalk64, maybe it wasn't finding your PDB for some reason.

In any case, Win64 definitely makes this all way nicer. It's because of the change in the exception model:

http://cbloomrants.blogspot.com/2010/06/06-07-10-exceptions.html

You now get guaranteed frames everywhere.

castano said...

Turns out that the problem is that RtlCaptureContext does not fill the context properly when FPO is enabled, so on x86 you have to do it manually. The following code seems to work:

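CONTEXT ctx = { 0 };
ctx.ContextFlags = CONTEXT_CONTROL;
_asm {
    call x              // pushes the address of label x as the return address
x:  pop eax             // eax = current instruction pointer
    mov ctx.Eip, eax
    mov ctx.Ebp, ebp    // current frame pointer
    mov ctx.Esp, esp    // current stack pointer
}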

One nice feature of the StackWalk64 API is that you can pass it the context provided to the unhandled exception filter. Trying that out is when I noticed what was causing the problem.
