Recently I've been fixing up a bunch of code that does things like
void MutexLock( Mutex * m )
{
    if ( ! m ) return;
    ...
yikes. Invalid argument and you just silently do nothing. No thank you.
We should all know that silently nopping in failure cases is pretty horrible. But I'm also dealing with a lot of error code returns, and it occurs to me that returning an error code in that situation is not much better.
Personally I want unexpected or unhandleable errors to just blow up my app. In my own code I would just assert; unfortunately that's not viable in OS code or perhaps even in a library.
The classic example is malloc. I hate mallocs that return null. If I run out of memory, there's no way I'm handling it cleanly and reducing my footprint and carrying on. Just blow up my app. Personally whenever I implement an allocator if it can't get memory from the OS it just prints a message and exits (*).
(* = aside : even better is "functions that don't fail" which I might write more about later; basically the idea is the function tries to handle the failure case itself and never returns it out to the larger app. So in the case of malloc it might print a message like "tried to alloc N bytes; (a)bort/(r)etry/return (n)ull?". Another common case is when you try to open a file for write and it fails for whatever reason, it should just handle that at the low level and say "couldn't open X for write; (a)bort/(r)etry/change (n)ame?" )
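That "never fails" idea can be sketched like so (`malloc_no_fail` is my own illustrative name, not from any real allocator, and the interactive prompt just stands in for whatever recovery policy you want) :

```c
#include <stdio.h>
#include <stdlib.h>

// Sketch of a "function that doesn't fail" : the failure is handled
// here, at the low level, and never escapes out to the larger app.
static void * malloc_no_fail(size_t bytes)
{
    for(;;)
    {
        void * p = malloc(bytes);
        if ( p ) return p;

        fprintf(stderr, "tried to alloc %zu bytes; (a)bort/(r)etry/return (n)ull? ", bytes);
        int c = getchar();
        if ( c == 'n' ) return NULL; // caller explicitly asked for NULL
        if ( c != 'r' ) abort();     // default : blow up
        // 'r' : loop around and try the alloc again
    }
}
```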
I think error code returns are okay for *expected* and *recoverable* errors.
On functions that you realistically expect to always succeed and will not check error codes for, they shouldn't return error codes at all. I wrote recently about wrapping system APIs for portable code ; an example of the style of level 2 wrapping that I like is to "fix" the error returns.
(obviously this is not something the OS should do, they just have to return every error; it requires app-specific knowledge about what kind of errors your app can encounter and successfully recover from and continue, vs. ones that just mean you have a catastrophic unexpected bug)
For example, functions like lock & unlock a mutex shouldn't fail (in my code). 99% of the user code in the world that locks and
unlocks mutexes doesn't check the return value, they just call lock and then proceed assuming the lock succeeded - so don't return it :
void mypthread_mutex_lock(mypthread_mutex_t *mutex)
{
    int ret = pthread_mutex_lock(mutex);
    if ( ret != 0 )
        { fprintf(stderr,"pthread_mutex_lock : %d\n",ret); abort(); } // crazy error : blow up here
}
When you get a crazy unexpected error like that, the app should just blow up right at the call site (rather
than silently failing and then blowing up somewhere weird later on because the mutex wasn't actually locked).
In other cases there are a mix of expected failures and unexpected ones, and the level-2 wrapper should differentiate
between them :
bool mysem_trywait(mysem * sem)
{
    for(;;)
    {
        int res = sem_trywait( sem );
        if ( res == 0 ) return true; // got it
        int err = errno;
        if ( err == EINTR ) continue;      // UNIX is such balls; just retry
        if ( err == EAGAIN ) return false; // expected failure, no count in sem to dec
        // crazy failure; blow up :
        fprintf(stderr,"sem_trywait : %d\n",err);
        abort();
    }
}
(BTW best practice these days is always to copy "errno" out to an int, because errno may actually be
#defined to a function call in the multithreaded world)
And since I just stumbled into it by accident, I may as well talk about EINTR. Now I understand that there may be legitimate reasons why you *want* an OS API that's interrupted by signals - we're going to ignore that, because that's not what the EINTR debate is about. So for purposes of discussion pretend that you never have a use case where you want EINTR and it's just a question of whether the API should put that trouble on the user or not.
I ranted about EINTR at RAD a while ago and was informed (reminded) this was an ancient argument that I was on the wrong side of.
Mmm. One thing certainly is true : if you want to write an operating system (or any piece of software) such that it is easy to port to lots of platforms and maintain for a long time, then it should be absolutely as simple as possible (meaning simple to implement, not simple in the API or simple to use), even at the cost of "rightness" and pain to the user. That I certainly agree with; UNIX has succeeded at being easy to port (and also succeeded at being a pain to the user).
But most people who argue on the pro-EINTR side of the argument are just wrong; they are confused about what the advantage of the pro-EINTR argument is (for example Jeff Atwood takes off on a general rant against complexity ; I think we all should know by now that huge complex APIs are bad; that's not interesting, and that's not what "Worse is Better" is about; or Jeff's example of INI files vs the registry - INI files are just massively better in every way, it's not related at all, there's no pro-con there).
(to be clear and simple : the pro-EINTR argument is entirely about simplicity of implementation and porting of the API; it's about requiring the minimum from the system)
The EINTR-returning API is not simpler (than one that doesn't force you to loop). Consider an API like this :
U64 system( U64 code );
if the top 32 bits of code are 77 this is a file open and the bottom 32 bits specify a device; the
return values then are 0 = call the same function again with the first 8 chars of the file name ...
if it returns 7 then you must sleep at least 1 milli and then call again with code = 44 ...
etc.. docs for 100 pages ...
what you should now realize is that *the docs are part of the API*. (that is not a "simple" API)
An API that requires you to carefully read about the weird special cases and understand what is going on inside the system is NOT a simple API. It might look simple, but that's a disguise. A simple API does what you expect it to. You should be able to just look at the function signature and guess what it does and be right 99% of the time.
Aside from the issue of simplicity, any API that requires you to write the exact same boiler-plate every time you use it is just a broken fucking API.
Also, I strongly believe that any API which returns error codes should be usable if you don't check the error code
at all. Yeah yeah in real production code of course you check the error code, but for little test apps you
should be able to do :
int fd = open("blah", O_RDONLY);
and that should work okay in my hack test app. Nope, not in UNIX it doesn't. Thanks to its wonderful "simplicity"
you have to call "read" in a loop because it might decide to return before the whole read is done.
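That boiler-plate retry loop looks something like this (`read_full` is my own name for it; a sketch of the usual idiom, not any standard API) :

```c
#include <errno.h>
#include <unistd.h>

// Read exactly "count" bytes unless EOF or a real error hits.
// Hides both EINTR and short reads from the caller.
static ssize_t read_full(int fd, void * buf, size_t count)
{
    size_t done = 0;
    while ( done < count )
    {
        ssize_t got = read(fd, (char *)buf + done, count - done);
        if ( got > 0 ) { done += (size_t)got; continue; }
        if ( got == 0 ) break;          // EOF : return a short count
        if ( errno == EINTR ) continue; // interrupted : just retry
        return -1;                      // real error
    }
    return (ssize_t)done;
}
```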
Another example that occurs to me is the reuse of keywords and syntax in C. Things like making "static" mean something completely different depending on how you use it makes the number of special keywords smaller. But I believe it actually makes the "API" of the language much *more* complex. Instead of having intuitive and obvious separate clear keywords for each meaning that you could perhaps figure out just by looking at them, you instead have to read a bunch of docs and have very technical knowledge of the internals of what the keywords mean in each usage. (there are legitimate advantages to minimizing the number of keywords, of course, like leaving as many names available to users as possible). Knowledge required to use an API is part of the API. Simplicity is determined by the amount of knowledge required to do things correctly.
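For the record, here are the unrelated meanings packed into that one keyword (plain C; the names are mine, just for illustration) :

```c
static int file_private;    // 1. at file scope : internal linkage -
                            //    invisible to other translation units

static int counter(void)
{
    static int calls = 0;   // 2. inside a function : persistent storage,
    return ++calls;         //    the value survives across calls
}

void take(int a[static 4]); // 3. (C99) in an array parameter : a promise that
                            //    the caller passes at least 4 valid elements
```

Three completely different semantics, one keyword; you can't guess any of them from the word "static" alone, which is exactly the point.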