Everybody just writes code like this :
U32 * bigTable = malloc(20<<20);
memset(bigTable,0,20<<20);
but that's a huge waste (e.g. for a large hash table on small files, the memset can dominate your time).
Behind your back, the operating system runs a lowest-priority thread whenever the machine is otherwise idle; it grabs free pages, fills them with zero bytes, and puts them on the zero'ed page list.
When you call VirtualAlloc, it just grabs a page from the zeroed page list and hands it to you (if there are none available, it zeroes one immediately).
!!! Memory you get back from VirtualAlloc is always already zeroed ; you don't need to memset it !!!
The OS does this for security, so you can never see some other app's bytes, but you can also use it to get zero'ed tables quickly.
(I'm not sure if any stdlib has a fast path to this for "calloc" ; if so that might be a reason to prefer that to malloc/memset; in any case it's safer just to talk to the OS directly).
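For comparison, the calloc version is a one-liner. This is only a sketch: alloc_table_calloc is a made-up name, and whether calloc actually skips the zeroing pass for big blocks depends on your C library, so measure before relying on it.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t U32;

/* Hypothetical helper: a zero-filled table of `count` U32 entries,
   or NULL on failure. Release it with free(). */
static U32 *alloc_table_calloc(size_t count)
{
    /* calloc guarantees zeroed memory; how it gets there is up to the library. */
    return (U32 *)calloc(count, sizeof(U32));
}
```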
ADDENDUM : BTW to be fair none of my string matchers do this, because other people's don't and I don't want to win from cheap technicalities like that. But all string match hash tables should use this.