07-18-11 - cblib Relacy

Announce : cblib now has its own version of "Relacy".

This is no replacement for the original. If you are serious about low-level threading, go get Dmitry's Relacy Race Detector; you must have it. It's great. (I'm using 2.3.0.)

Now in cblib is "LF/cblibRelacy.h" which is a look-alike for Relacy.

What you do is write code for Relacy just like you normally would. That is, you write C++0x but you add $ on all the shared variable accesses (and use rl::backoff and a few other things). You set up a test class and run rl::simulate.

Now you can change your #include "relacy.h" to #include "cblib/lf/cblibRelacy.h" and it should just transparently switch over to my version.

What does my version do differently?

0. First of all, it is no replacement for the real Relacy, which is a simulator that tries many possible races; cblibRelacy uses real OS threads, not fibers, so the tests are not nearly as comprehensive. You still need to run your tests in the real Relacy first to check that your algorithm is correct.

1. It gives you actually usable compiled code. eg. if you take the clh_mutex or anything I've posted recently and combine it with cblibRelacy.h , you have a class you can go and use. (but don't literally do that)

2. It runs its tests on the actual compiled code. That means you aren't testing in a simulator which might hide problems with the code in the real world (eg. if your implementation of atomics has a bug, or if there's a priority inversion caused by the scheduling of real OS threads).

3. Obviously you can test OS primitives that Relacy doesn't support, like Nt keyed events, or threads waiting on IO, etc.

4. It can run a lot more iterations than the real Relacy because it's using real optimized code; for example I found bugs with my ticket locks when the ticket counters overflowed 16 bits, which is way more iterations than you can do in real Relacy.

How does it work :

I create a real Win32 thread for each of the test threads. The atomic ops translate to real atomic ops on the machine. I then run the test many times, and try to make the threads interleave a bit to make problems happen. The threads get stopped and started at different times and in different orders to try to get them to interleave a bit differently on each test iteration.
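Here is a rough sketch of that loop. It is not the actual cblibRelacy code: it uses portable std::thread rather than Win32 threads, and run_stress plus its before/body/after callbacks are made-up names that loosely mirror the shape of a Relacy test class. The only point is to show threads being started in a shuffled order with small random delays on every iteration.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <numeric>
#include <random>
#include <thread>
#include <vector>

// Hypothetical helper (not the cblibRelacy API). Each iteration:
//   before()      resets shared state on the calling thread,
//   body(index)   runs on its own real OS thread,
//   after()       checks invariants once all threads have joined.
template <typename Before, typename Body, typename After>
void run_stress(int num_threads, int iterations,
                Before before, Body body, After after)
{
    std::mt19937 rng(12345); // only touched by the calling thread
    std::vector<int> order(num_threads);
    std::iota(order.begin(), order.end(), 0);

    for (int it = 0; it < iterations; ++it) {
        before();

        // Start the threads in a different order each iteration.
        std::shuffle(order.begin(), order.end(), rng);

        std::vector<std::thread> threads;
        for (int index : order) {
            // A small random delay staggers when each thread hits the test body.
            int delay_us = std::uniform_int_distribution<int>(0, 50)(rng);
            threads.emplace_back([&body, index, delay_us] {
                std::this_thread::sleep_for(std::chrono::microseconds(delay_us));
                body(index);
            });
        }
        for (auto& t : threads) t.join();

        after();
    }
}
```

You would call it with something like run_stress(4, 1000, reset, thread_func, check), where check does the asserting.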

Optionally (set by cblibRelacy_do_scheduler), I can also use the Relacy $ points to juggle my scheduling, just the same way Relacy does with fibers. Wherever a $ occurs (access to a shared variable), I randomize an operation and might spin a bit or sleep the thread or some other things. This gives you way more interleaving than you would get just from letting the OS do thread switching.
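A minimal sketch of what such a reschedule point could look like, under some assumptions: the macro is called SCHED_POINT here rather than $, the odds (3 in 16 of doing anything) are invented, and std::this_thread::yield() stands in for SwitchToThread(). The real cblibRelacy scheduler may do something quite different.

```cpp
#include <chrono>
#include <functional>
#include <random>
#include <thread>

// What the reschedule point decided to do; returned so callers can count it.
enum class Juggle { None, Spin, Yield, Sleep };

// Hypothetical hook: randomly perturbs the calling thread's timing.
inline Juggle sched_point(const char* file, int line)
{
    (void)file; (void)line; // a fuller version could log these for a trace

    // One generator per thread, seeded from the thread id, so threads
    // do not share (and race on) the random state.
    thread_local std::mt19937 rng(static_cast<unsigned>(
        std::hash<std::thread::id>{}(std::this_thread::get_id())));

    switch (rng() % 16) {
    case 0: {
        // Spin briefly; the volatile store keeps the loop from being removed.
        volatile int sink = 0;
        for (int i = 0; i < 200; ++i) sink = i;
        return Juggle::Spin;
    }
    case 1:
        std::this_thread::yield(); // portable stand-in for SwitchToThread()
        return Juggle::Yield;
    case 2:
        std::this_thread::sleep_for(std::chrono::microseconds(100));
        return Juggle::Sleep;
    default:
        return Juggle::None; // most of the time, do nothing
    }
}

// It has to be a macro so that __FILE__ and __LINE__ refer to the call site.
#define SCHED_POINT() sched_point(__FILE__, __LINE__)
```

Raising the odds gives more interleaving per iteration but slows each iteration down, which is the trade-off against fiber switching.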

Now, as I said, this is no substitute for the real Relacy; you'll never get as many fine switches back and forth as he does. (Well, you could of course, if you made the $ juggle your thread almost always, but that would actually make it much slower than Relacy, because he uses fibers to make all the switching less expensive.)

One important note - cblibRelacy will not stress your test very well unless you run it with more threads than you have cores. The reason is that if your system is not oversubscribed, then the SwitchToThread() that I use in $ will do nothing.
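A quick way to check this before running a test, as a sketch: is_oversubscribed is a made-up helper, not something in cblibRelacy, and it only compares the test's thread count against what the standard library reports.

```cpp
#include <thread>

// Hypothetical check: true if the test uses more threads than the machine
// reports hardware threads for.
inline bool is_oversubscribed(unsigned test_threads)
{
    // hardware_concurrency() is only a hint and may return 0 if unknown.
    unsigned cores = std::thread::hardware_concurrency();
    if (cores == 0)
        return false; // cannot tell, so do not claim the test is oversubscribed
    return test_threads > cores;
}
```

If it returns false, bump the thread count of the test class before trusting a clean run.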

Also, don't be a daft/difficult commenter. This code is intended as a learning tool, it's obviously not ready to be used directly in a production environment (eg. I don't check any OS return codes, and I intentionally make failures be asserts instead of handling them gracefully). If you want some of these primitives, I suggest you learn from them then write your own versions of things. Or, you know, buy Oodle, which will have production-ready multi-platform versions of lots of stuff.

ADDENDUM : I should also note, to be clear : Relacy detects races and other failure types and will tell you about them. cblibRelacy just runs your code. That means to actually make it a test, you need it to do some asserting. For example, with real Relacy you can test your LF Stack by just pushing some nodes and popping some nodes. With cblibRelacy that wouldn't tell you much (unless you crash). You need to write a test that pushes some values, then pops some values and asserts that it got the same things out.
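Something like the following, as a sketch. The locked_stack here is just a mutex-protected stand-in so the example compiles on its own; the idea is to swap in the lock-free stack you actually want to test. push_pop_test and the value numbering are invented for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in container; replace with the stack under test.
struct locked_stack {
    std::mutex m;
    std::vector<int> items;
    void push(int v) { std::lock_guard<std::mutex> g(m); items.push_back(v); }
    bool pop(int& out) {
        std::lock_guard<std::mutex> g(m);
        if (items.empty()) return false;
        out = items.back();
        items.pop_back();
        return true;
    }
};

// Each thread pushes per_thread unique values, then pops per_thread values.
// Returns true if every pushed value came back out exactly once.
inline bool push_pop_test(int num_threads, int per_thread)
{
    locked_stack stack;
    const int total = num_threads * per_thread;
    std::vector<std::atomic<int>> seen(static_cast<std::size_t>(total));
    for (auto& s : seen) s.store(0);
    std::atomic<bool> ok{true};

    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&, t] {
            for (int i = 0; i < per_thread; ++i)
                stack.push(t * per_thread + i);
            for (int i = 0; i < per_thread; ++i) {
                int v = -1;
                // A thread never pops more than it has pushed, so the stack
                // cannot be empty here unless the stack is broken.
                if (!stack.pop(v) || v < 0 || v >= total) { ok.store(false); return; }
                seen[static_cast<std::size_t>(v)].fetch_add(1);
            }
        });
    }
    for (auto& th : threads) th.join();

    // Every value must have been popped exactly once, and nothing is left over.
    for (auto& s : seen)
        if (s.load() != 1) ok.store(false);
    int leftover = 0;
    if (stack.pop(leftover)) ok.store(false);
    return ok.load();
}
```

Then the test body is just assert(push_pop_test(4, 1000)); a lost or duplicated node shows up as a failed assert instead of silently passing.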


nothings said...

How portable is '#define $ foo'? That seems a crazy choice to me, but I guess you were limited by Relacy compatibility.

cbloom said...

So far as I know it's portable. It's always a bit dangerous using such a short identifier; if anyone else had the crazy idea of using $ then you have conflict problems.

But anything longer would make the syntax unbearable.

The $ has to be a #define to get FILE and LINE (which Relacy uses to make the debug trace nice). Otherwise the scheduler struct could be inserted with a class shim.

My stuff should still work fine without the $'s , you just won't get the reschedule points. So they can be removed for production code.

ryg said...

That's not what Sean means (I think); "$" is not actually in the supported input character set for C identifiers, nor in the set required to be supported by the preprocessor. The preprocessor is allowed to support $ (and most do), but this is an extension. See e.g. here.

cbloom said...

Sometimes "so far as I know" doesn't go very far...

cbloom said...

You can actually use Relacy without the $

He has a macro VAR, so instead of

xx($)

you do

VAR(xx)

but I like the look of the code much better with the $.
