2/23/2011

02-23-11 - Some little coder things - Error cleanup with break

I hate code that does error cleanup in multiple places, eg :

    FILE * fp = fopen(fileName,"wb");

    if ( ! stuff1() )
    {
        fclose(fp);
        return false;
    }
    
    if ( ! stuff2() )
    {
        fclose(fp);
        return false;
    }

    // ok!
    return true;

the error cleanup has been duplicated and this leads to bugs.

In the olden days we fixed this by putting the error return at the very end (after the return true) and using a goto to get there. But gotos don't play nice with C++ and are just generally deprecated. (don't get me started on setjmp, WTF is libpng thinking using that archaic error handling system? just because you think it's okay doesn't mean your users do)

Obviously the preferred way is to always use C++ classes that clean themselves up. In fact whenever someone gives me code that doesn't clean itself up, I should just immediately make a wrapper class that cleans itself up. I find myself getting annoyed and having bugs whenever I don't do this.

There is, however, a cleanup pattern that works just fine. This is well known, but I basically never ever see anyone use this, which is a little odd. If you can't use C++ self-cleaners for some reason, the next best alternative is using "break" in a scope that will only execute once.

For example :


rrbool rrSurface_SaveRRSFile(const rrSurface * surf, const char* fileName)
{
    FILE * fp = fopen(fileName,"wb");
    if ( ! fp )
        return false;
    
    for(;;)
    {
        rrSurface_RRS_Header header;
        
        if ( ! rrSurface_FillHeader(surf,&header) )
            break;
    
        if ( ! rrSurface_WriteHeader(fp,&header) )
            break;
        
        if ( ! rrSurface_WriteData(fp,surf,&header) )
            break;
        
        // success :
        
        fclose(fp);
        return true;
    }
    
    // failure :
    
    fclose(fp); 
    return false;
}

Really the break is just a simple form of goto that works with C++. When you have multiple things to cleanup obvious you have to check each of them vs uninitialized.

(BTW this example is not ideal because it doesn't give you any info about the failure. Generally I think all code should either assert or log about errors immediately at the site where the error is detected, not pass error codes up the chain. eg. even if this code was "good" and had a different error return value for each type of error, I hate that shit, because it doesn't help me debug and get a breakpoint right at the point where the error is happening.)

ADDENDUM :

Another common style of error cleanup is the "deep nest with partial cleanup in each scope". Something like this :


  bool success = false;
  if ( A = thing1() )
  {
    if ( B = thing2() )
    {
      if ( C = thing3() )
      {
        success = true;
        cleanup C;
      }
      cleanup B;
    }
    cleanup A;
  }

I really hate this style. While it doesn't suffer from duplication of the cleanups, it does break them into pieces. But worst, it makes the linear code flow very unclear and introduces a deep branching structure that's totally unnecessary. Good code should be a linear sequence of imperatives as much as possible. (eg. do X, now do Y, now do Z).

I think this must have been an approved Microsoft style at some point because you see it a lot in MSDN samples; often the success code path winds up indented so far to the right that it's off the page!

11 comments:

Justin Paver said...

I use a slightly different idiom, but only for error handling.

#define breakAndGotoLabel(LBL) \
do { \
assert(0); \
goto LABEL; \
} while (0)

void func()
{
if (someError)
breakAndGotoLabel(FAIL)

// success
return true;
FAIL:
cleanup();
return false;
}

The advantage there is that you can cleanup code even if you have a multiple nested scope (break doesn't work for that), and you assert/log at the site of the error in debug builds. The disadvantage is that everyone jumps down your throat because you used goto (it being "evil" and all that". The way I rationalize it is that I'm hardly using it for complex flow control, and it significantly increases the readability of the code.

Fabian said...

The "wrap it inside a scope and use break" idea is good, think I'll use that from now, though I'm probably gonna use "do { ... } while(0);" instead of a for(;;) with return.

"even if this code was "good" and had a different error return value for each type of error, I hate that shit, because it doesn't help me debug and get a breakpoint right at the point where the error is happening"
A nice way to do this is just:

"static inline ErrorCode Error(ErrorCode x) { return x; }"

then use "return Error(blah);" instead of "return blah;". Then you can put an assert into Error() (either always fault or fault on specific errors) and you can easily grep for paths that return error codes.

penwan said...

You've still got your cleanup code in two places: success and failure. A simpler solution:

rrbool rrSurface_SaveRRSFile(const rrSurface * surf, const char* fileName)
{
FILE * fp = fopen(fileName,"wb");
if ( ! fp )
return false;

rrbool result = false; // we failed unless we succeed
do
{
rrSurface_RRS_Header header;

if ( ! rrSurface_FillHeader(surf,&header) )
break;

if ( ! rrSurface_WriteHeader(fp,&header) )
break;

if ( ! rrSurface_WriteData(fp,surf,&header) )
break;

// success :
result = true;
} while( 0 )

// finally :

fclose(fp);
return result;
}

cbloom said...

Yeah, ok.

Though I should be clear that really the preferred way is C++ wrappers and this is just sort of the least-bad way.

It's still just too easy to slip in a return somewhere and fail to clean up, or to write the wrong cleanup code. There's absolutely no reason why objects should not be self-cleaning.

If for some weird reason you can't do that, you can also get the same effect using an on-destruct callback.

In cblib I would do :

FILE * fp = fopen(..)
ON_DESTRUCT1(fclose,fp);

Tom Forsyth said...

Why is the break example better than gotos? Using break has all the usual problems with breaks if you have any nesting. Adding a nesting or being at all careless with cut'n'paste will screw you. I now regard break as a bug waiting to happen for any loop longer than 20 lines.

cbloom said...

" Why is the break example better than gotos? "

Just because break works with C++ and goto doesn't.

castano said...

"ON_DESTRUCT1(fclose,fp);"

Ugh, C++ lack of closures annoys me to no end!

I know, C++0x...

Fabian said...

"Just because break works with C++ and goto doesn't."
Where's that coming from, btw?

It gets tricky if you just jump into the middle of some scope, but if you put a label right after the end of a scope you're currently nested in those problems don't occur (the main problem is visibility of symbols that will be declared in the current block but haven't been declared yet - a non-issue if you jump into a scope where those symbols aren't visible anyway).

As far as I can tell, "emulating" breaks (or multi-level breaks) with gotos is always safe. Any specific case where that doesn't work?

cbloom said...

Yeah, it's fine as long as you just emulate a break, but what people actually do with gotos is something like

ret = false;

if ( ! (A = makeA) )
goto fail;

if ( ! (B = makeB) )
goto fail;

ret = true;

fail:

cleanup(A);
cleanup(B);

return ret;


the nasty thing, why I think goto should be avoided, is that if you happen to write a function like this that doesn't use any classes, then it will work just fine. Now you have created a function that I cannot use C++ in without rewriting how you do cleanup.

So what I'm trying to say is that "what people mean when they say goto-style error handling doesn't work with C++".

If you write it in the "break style", then sure you can use goto instead of break if you prefer.

michael said...

The only good way to solve this is by using RAII. No other method, goto, break or whatever, is going to be as complete.

If there is an exception thrown in any function called from within the function then the resources will not be cleaned up correctly (unless you plan on adding try/catch everywhere).

If this were my code, I would write a sharable ref counted class to wrap the file. No clean up code is needed then. And clean up, in the event of an exception, is automatically handled when the stack is cleaned up.

The following code is of the top of my head, doesn't test for valid file pointer etc. but otherwise, is the best way to handle resource management in C++.


class File
{
public:
File(FILE * fp):m_fp(fp){}
~File(){fclose(m_fp);}
protected:
private:
FILE * m_fp;
};

typedef boost::shared_ptr FilePtr;

FilePtr MyFunction(const char * pPath)
{
FilePtr spFile(new File(fopen(pPath)));

SomeFunctionWhichMightThrow();

return spFile;
}

cbloom said...

Michael, I believe I said that already. Several times.

old rants