5/26/2010

05-26-10 - Windows Page Cache

The correct way to cache things is through Windows' page cache. The advantages of doing this over writing your own custom cache code are :

1. Automatically resizes based on the amount of memory needed by other apps, e.g. other apps can steal memory from your cache to run.

2. Automatically gives pages away to other apps or to file IO or whatever if they are touching their pages more often than you are touching yours.

3. Automatically keeps the cache in memory between runs of your app (if nothing else clears it out). This is pretty immense.

Because of #3, your custom caching solution might slightly beat using the Windows cache on the first run, but on the second run it will stomp all over you.

To do this nicely, generally the cool thing to do is make a unique file name that is the key to the data you want to cache. Write the data to a file, then memory map it as read only to fetch it from the cache. It will now be managed by the Windows page cache and the memory map will just hand you a page that's already in memory if it's still in cache.
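Here's a minimal sketch of that pattern in Python, whose `mmap` module sits on top of the OS file-mapping facilities. The cache directory, the hashing of the key into a file name, and the `cache_put` / `cache_get` names are all made up for illustration; none of it comes from a real library.

```python
import hashlib
import mmap
import os
import tempfile

# Hypothetical cache location; any directory on a local disk would do.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "page_cache_sketch")


def cache_path(key: str) -> str:
    """Turn the cache key into a unique, filesystem-safe file name."""
    name = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return os.path.join(CACHE_DIR, name + ".cache")


def cache_put(key: str, data: bytes) -> None:
    """Write the data to its file; the OS keeps the pages cached if it can."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(cache_path(key), "wb") as f:
        f.write(data)


def cache_get(key: str):
    """Return a read-only memory map of the cached data, or None on a miss."""
    try:
        f = open(cache_path(key), "rb")
    except FileNotFoundError:
        return None
    with f:
        if os.fstat(f.fileno()).st_size == 0:
            return None  # an empty file cannot be memory mapped
        # The map stays valid after the file object is closed.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

A hit that is still resident costs no disk read; a hit that has been evicted just pages back in from the file. The caller should `close()` the returned map when done with it.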

The only thing that's not completely awesome about this is the reliance on the file system. It would be nice if you could do this without ever going to the file system, e.g. if the page is not in cache, I'd like Windows to call my function to fill that page rather than getting it from disk, but so far as I know this is not possible in any easy way.

For example : say you have a bunch of compressed images as JPEG or whatever. You want to keep uncompressed caches of them in memory. The right way is through the Windows page cache.

2 comments:

Autodidactic Asphyxiation said...

I think using FILE_ATTRIBUTE_TEMPORARY has the "avoid filesystem at all costs" behavior, but obviously lacks the cross-invocation persistence. I think what you could do is have an overlay system. Say I want the uncompressed goatse.jpg:

1) If goatse.jpg.tmp exists, use it.

2) Otherwise, make a temporary file, and decompress goatse.jpg into it.

3) All new temporary files are queued as a really low-priority entity on a shared disk thread. When the disk IO thread is idle, it does MoveFileEx to commit the temporary data to the filesystem. Make sure you flush this queue on program shutdown.

Autodidactic Asphyxiation said...

Something that wasn't clear: when you make the temporary file, just use GetTempFileName. When you "commit" it, make a useful name, like goatse.jpg.tmp.
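A rough Python sketch of the overlay idea from these two comments, under some loud assumptions: `open_uncompressed`, `flush_commits` and the `decompress` callback are hypothetical names, the low-priority disk thread is reduced to a plain pending list that you flush yourself, and the sketch does not set FILE_ATTRIBUTE_TEMPORARY. `os.replace` stands in for the MoveFileEx commit step.

```python
import os
import tempfile

# (temp_path, final_path) pairs waiting to be committed.
_pending = []


def open_uncompressed(src_path, decompress):
    """Return the path of a file holding the decompressed contents of src_path."""
    final_path = src_path + ".tmp"
    # 1) If the committed overlay file exists, use it.
    if os.path.exists(final_path):
        return final_path
    # 2) Otherwise decompress into a uniquely named temporary file
    #    in the same directory, so the later rename stays on one volume.
    directory = os.path.dirname(final_path) or "."
    fd, temp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as out, open(src_path, "rb") as src:
        out.write(decompress(src.read()))
    # 3) Queue the rename instead of doing it right away.
    _pending.append((temp_path, final_path))
    return temp_path


def flush_commits():
    """Give every queued temporary file its useful name. Call at idle time and on shutdown."""
    while _pending:
        temp_path, final_path = _pending.pop()
        os.replace(temp_path, final_path)
```

One caveat with this simplification: on Windows the rename can fail while the temporary file is still open or mapped, so a real version would need to commit only files that nobody is currently using.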
