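extern "C" DWORD __cdecl FastTlsGetValue_x86(int index)
{
    // No return statement: the result is left in eax, which MSVC uses as the return value.
    __asm
    {
        mov eax,dword ptr fs:[00000018h]    // eax = Teb (self pointer at fs:[18h])
        mov ecx,index
        cmp ecx,40h                         // 40h = 64
        jae over64                          // unsigned compare: jump if index >= 64
        // return Teb->TlsSlots[ index ]
        // +0xe10 TlsSlots : [64] Ptr32 Void
        mov eax,dword ptr [eax+ecx*4+0E10h]
        jmp done
    over64:
        // return Teb->TlsExpansionSlots[ index - 64 ]
        // +0xf94 TlsExpansionSlots (not checked for NULL here)
        mov eax,dword ptr [eax+0F94h]
        mov eax,dword ptr [eax+ecx*4-100h]  // -100h = -64*4
    done:
    }
}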
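DWORD64 FastTlsGetValue_x64(int index)
{
    if ( index < 64 )
    {
        // Teb->TlsSlots[ index ] ; +0x1480 TlsSlots : [64] Ptr64 Void
        return __readgsqword( 0x1480 + index*8 );
    }
    else
    {
        // Teb->TlsExpansionSlots[ index - 64 ] ; +0x1780 TlsExpansionSlots
        DWORD64 * table = (DWORD64 *) __readgsqword( 0x1780 );
        return table[ index - 64 ];
    }
}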
The ASM one is originally from Nynaeve (1, 2).
I'd rather rewrite it in C using __readfsdword but haven't bothered.
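For reference, a C-style version along those lines might look something like the following. This is an untested sketch, not the Nynaeve code: the name FastTlsGetValue_x86_C, the constants, and the TlsSlotOffset_x86 helper are made up for illustration, the offsets are the same 0xE10 and 0xF94 used in the asm above, and the intrinsic path only compiles under MSVC targeting x86.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && defined(_M_IX86)
#include <intrin.h>
#endif

// 32-bit TEB offsets, the same values the asm version uses.
constexpr uint32_t kTebTlsSlots_x86          = 0xE10; // TlsSlots : [64] Ptr32 Void
constexpr uint32_t kTebTlsExpansionSlots_x86 = 0xF94; // pointer to the expansion slot table

// Hypothetical helper: fs-relative byte offset of an inline slot (index < 64).
constexpr uint32_t TlsSlotOffset_x86(uint32_t index)
{
    return kTebTlsSlots_x86 + index * 4;
}

#if defined(_MSC_VER) && defined(_M_IX86)
// Hypothetical name; same logic as FastTlsGetValue_x86 without inline asm.
// Like the asm, it does not check for a null expansion table.
inline uint32_t FastTlsGetValue_x86_C(uint32_t index)
{
    if (index < 64)
        return __readfsdword(TlsSlotOffset_x86(index));

    const uint32_t * table =
        (const uint32_t *)(uintptr_t) __readfsdword(kTebTlsExpansionSlots_x86);
    return table[index - 64];
}
#endif
```

Taking the index as unsigned keeps the same behavior as the asm's jae: a negative index goes down the expansion path instead of reading before TlsSlots.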
Note that these may cause a bogus failure in MS App Verifier.
Also, as noted many times in the past, you should just use the compiler's __declspec(thread) under Windows whenever that's possible for you (e.g. you're not in a DLL pre-Vista).
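For anyone who hasn't used it, the compiler route looks roughly like this. It's a minimal sketch: MY_THREAD_LOCAL, t_callCount and BumpCallCount are made-up names, and the thread_local fallback is only there so the snippet builds on non-MSVC compilers.

```cpp
#if defined(_MSC_VER)
#define MY_THREAD_LOCAL __declspec(thread)   // MSVC storage-class attribute
#else
#define MY_THREAD_LOCAL thread_local         // C++11 fallback for other compilers
#endif

// Each thread gets its own copy, initialized to zero.
// No TlsAlloc / TlsGetValue calls are needed to reach it.
static MY_THREAD_LOCAL int t_callCount = 0;

// Hypothetical example function: counts calls made on the calling thread only.
int BumpCallCount()
{
    return ++t_callCount;
}
```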
I'm confused. Is this entire post telling us what not to do, without telling us what to do? I understand that __declspec(thread) is preferred when not prohibited, but what about when it is prohibited?
The post implicitly assumes that the reader is aware of or can find TlsAlloc/TlsGetValue (or FlsAlloc/FlsGetValue).
The other option is to have your own "State" struct that you pass through every function in your code. The State can then be thread-local, or fiber-local, or job-local, etc.
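A rough sketch of that approach, with made-up names (State's fields, Allocate, DoWork). Where each State lives (per thread, per fiber, per job) is up to the caller:

```cpp
#include <cstddef>

// Hypothetical per-context state; holds what would otherwise be TLS variables.
struct State
{
    std::size_t bytesAllocated = 0;
    int         errorCount     = 0;
};

// Every function takes the State explicitly instead of looking it up in TLS.
void Allocate(State & state, std::size_t bytes)
{
    state.bytesAllocated += bytes;
}

void DoWork(State & state)
{
    Allocate(state, 128);
    Allocate(state, 64);
}
```

Two State objects stay fully independent, so nothing depends on the TLS slot layout or the OS at all. The cost is the extra parameter on every call.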