7/13/2020

Robust Win32 IO

I see far too much code in production that does not use Win32 IO robustly. Some of the issues are subtle and tricky, but many of them just come down to checking error codes and return values. You cannot assume :
ReadFile(size) either successfully reads all "size" bytes, or fails mysteriously and we should abort
What you actually need to be handling is :
ReadFile(size)
succeeded but got less than size
failed but failed due to being already at EOF
failed but failed due to a temporary system condition that we should retry
succeeded but is not asynchronous the way we expected
succeeded and was asynchronous but then GetOverlapped result does not wait as we expected
failed but failed due to IO size being too big and we should cut it into pieces

In a surely pointless attempt to improve matters, I've tried to make easy to use clean helpers that do all this for you, so you can just include this code and have robust IO :

robustwin32io.zip

Even if you are being careful and checking all the error codes, some issues you may not be handling :

  • Cut large IOs into pieces. You may have memory allocated for your large IO buffer, but when you create an IO request for that, the OS needs to mirror that into the disk cache, and into kernel memory space for the IO driver. If your buffer is too large, that can fail due to running out of resources.

    (this is now rarely an issue on 64-bit windows, but was common on 32-bit windows, and can still happen on the MS consoles)

  • Retry IOs in case of (some) failures. One of the causes of IO failure if too many requests in the queue, for example if you are spamming the IO system generating lots of small request. If you get these failures you should wait a bit for the queue to drain out then retry.

  • Always call GetOverLappedResult(FALSE) (no wait) before GetOverLappedResult(TRUE) (wait) to reset the event. If you don't do this, GetOverLappedResult(TRUE) can return without waiting for the IO to return, causing a race against the IO. This behavior was changed in Windows 7 so this might not be necessary any more, but there's some dangerous behavior with the manual-reset Event in the OVERLAPPED struct. When you start an async IO it is not necessarily reset to unsignaled. When you GetOverLappedResult(TRUE) it is supposed to be waiting on an event that is signalled when the IO completes, but if the event was already set to signalled before you called, it will just return immediately.

    NOTE this is not the issue with trying to do GetOverLappedResult on the same OVERLAPPED struct from multiple threads - that is just wrong; access to the OVERLAPPED struct should be mutex protected if you will query results from multiple threads, and you should also track your own "is io async" status to check before calling GetOverLappedResult.

  • Always call SetLastError(0) before calling any Windows API and then doing GetLastError. See previous blog on this topic : 10-03-13 - SetLastError(0). This particular bug was fixed in Windows Vista (so a while ago), but I'm paranoid about it and it's harmless to do, so I still do it. GetLastError/SetLastError is just a variable in your thread-info-block, so it's only a few instructions to access it. It's best practice to always SetLastError(0) at the start of a sequence of operations, that way you know you aren't getting errors that were left over from before.

For example, here's how to call GetOverlappedResult : (only call if st == win32_io_started_async)

BOOL win32_get_async_result(HANDLE handle,
                            OVERLAPPED * data,
                            DWORD * pSize)
{
    // only call this if you got "win32_io_started_async"
    // so you know IO is actually pending
            
    DWORD dwSize = 0;
    
    // first check result with no wait
    //  this also resets the event so that the next call to GOR works :
    
    if ( GetOverlappedResult(handle,data,&dwSize,FALSE) )
    {
        if ( dwSize > 0 )
        {
            *pSize = (DWORD) dwSize;
            return true;
        }
    }   
    
    // if you don't do the GOR(FALSE)
    //  then the GOR(TRUE) call here can return even though the IO is not actually done
    
    // call GOR with TRUE -> this yields our thread if the IO is still pending
    if ( ! GetOverlappedResult(handle,data,&dwSize,TRUE) )
    {
        DWORD err = GetLastError();
        
        if ( err == ERROR_HANDLE_EOF )
        {
            if ( dwSize > 0 )
            {
                *pSize = (DWORD) dwSize;
                return true;
            }
        }
        
        return false;       
    }
        
    *pSize = (DWORD) dwSize;
    
    return true;    
}           

Get the code :

robustwin32io.zip

Also note that I don't recommend trying to do unbuffered writes on Windows. It is possible but complex and requires privilege elevation which puts up a UAC prompt, so it's not very usable in practice. Just do buffered writes. See also : 01-30-09 - SetFileValidData and async writing and 03-12-09 - ERROR_NO_SYSTEM_RESOURCES

3 comments:

Thom said...

I think this code has a bug in the sync read code (I don't use the writing functions so I don't know if it's there too, didn't look).

Essentially, the problem is `win32_sync_read_sub` returns an error if asked to do a read that starts at EOF, but `win32_sync_read` can request a read like that in later iterations of the loop if `pos + N * MAX_SINGLE_IO_SIZE` happens to be exactly the file size, and `pos + size` happens to go past the end of the file (which seems to be supported otherwise).

This is rare, but can happen if you have files with sizes which are an exact multiple of 2MB (e.g. 1<<21, MAX_SINGLE_IO_SIZE) and do a read starting at 0 with an overly large buffer.

In my case it was trivial to fix, and had the code not explicitly advertised as "robust" I'd probably not bother mentioning it and just assume that it's buggy if `pos + size` goes past the end of the file in rare cases, but as it is it's probably worth fixing for the sake of robustness.

cbloom said...

Yes. In that case it should still read the bytes and fill *pGotSize correctly, but it returns FALSE which is misleading and not well documented.

I think probably the cleanest thing is to explicitly return that EOF was reached, so instead of returning BOOL return a trinary. The EOF handling in general in this code is not great.

To be clear the claims of "robust" are optimistic at best, since I don't actually use this code as-is, and untested/unused code is always crap. (good coder principle is to always ship the exact same code that you use every day, not have a ship version vs. daily version). At the moment it's more of a demonstration of some issues that I see people fail to handle regularly. That said I'll try to maintain it so actual usability improves.

cbloom said...

.. updated to make EOF a first class state, which cleans up things a bit. Hitting EOF during a read now counts as success, but the number of bytes read might be less than he number requested.

Caveat above still applies : this is a basically untested extraction of code to make it dependency-free that isn't what I actually use, so apologies if it's not great.

old rants