cbloom rants: 02-19-09 - Thread Safety Levels

2/19/2009

02-19-09 - Thread Safety Levels

I am trying to mark up my code with explicit comments about 3 levels of thread safety. I think this is a good concept that I haven't really seen discussed much :

Completely Thread-Safe (CTS) :

This function touches only local variables, objects through locks, objects which are intentionally & safely lock-free, TLS variables, and other stuff that is totally thread safe.

Object Thread-Safe (OTS) :

These functions can touch anything that is CTS, and can also touch any objects passed into them. If the objects passed in are completely owned by the caller, then the call is effectively CTS. Class member functions should typically be OTS, for example.

Not Thread-Safe (NTS) :

These functions touch globals or something and are just not thread safe. You must ensure they are run in sequential order.

So, for example, most of my init & shutdown code is NTS. I assume that you do inits, then start threads, then kill threads, then do shutdowns.
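
A minimal sketch of what this markup might look like in practice, with the levels written as plain comments. Every name here (`InitTotals`, `FillSquares`, `AddToTotal`, `RunAll` and so on) is hypothetical and made up for illustration; nothing is checked by the compiler.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// All names below are hypothetical; the CTS/OTS/NTS tags are only comments.

static std::mutex g_total_lock;
static long g_total = 0;  // shared; worker threads touch it only under g_total_lock

// NTS: writes a global with no lock. Run before any threads start.
void InitTotals() { g_total = 0; }

// NTS: reads a global with no lock. Run after all threads are joined.
long ShutdownTotals() { return g_total; }

// OTS: touches only the object passed in. Safe when the caller owns `out`.
void FillSquares(std::vector<int>& out, int n) {
    for (int i = 0; i < n; ++i) out.push_back(i * i);
}

// CTS: locals plus shared state that is only reached through a lock.
void AddToTotal(int amount) {
    std::lock_guard<std::mutex> lock(g_total_lock);
    g_total += amount;
}

// CTS as used here: the vector is local to this thread, so the OTS call is safe.
void Worker() {
    std::vector<int> mine;
    FillSquares(mine, 4);  // 0, 1, 4, 9
    int sum = 0;
    for (int v : mine) sum += v;
    AddToTotal(sum);       // adds 14
}

// The assumed ordering: init, start threads, kill (join) threads, shutdown.
long RunAll(int thread_count) {
    InitTotals();
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) threads.emplace_back(Worker);
    for (std::thread& t : threads) t.join();
    return ShutdownTotals();
}
```

With three workers, `RunAll(3)` returns 42, since each worker adds 14 under the lock.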

It would be really awesome if I could mark up functions in C++ with extra constraints. Then I could stick "CTS" on a function decl and the compiler could tell me if it does anything that doesn't comply with the CTS constraint. OTS can call CTS. NTS can call anything. If CTS calls NTS, it's a compile error.
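
C++ has no such constraint, but a rough approximation can be sketched with tag types and `static_assert`. This is only an illustration under loud assumptions: `Safety`, `Tag`, `MayCall` and `CallFrom` are invented names, the tags are supplied by hand at each call site, and the compiler never inspects what a function body actually touches. The sketch also assumes a strict ordering (a caller may only call something at least as safe as itself), which is stricter than the text above, where CTS code can use OTS functions on objects it owns.

```cpp
// Hypothetical tags; this is not a real C++ thread-safety feature.
enum class Safety { NTS = 0, OTS = 1, CTS = 2 };

// Assumed rule: the callee must be at least as safe as the caller.
constexpr bool MayCall(Safety caller, Safety callee) {
    return static_cast<int>(callee) >= static_cast<int>(caller);
}

// Empty type that carries a function's claimed level to the call site.
template <Safety Level>
struct Tag {};

// Routes a call through a compile-time check of caller vs. callee level.
template <Safety Caller, Safety Callee, typename Fn>
auto CallFrom(Tag<Callee>, Fn&& fn) -> decltype(fn()) {
    static_assert(MayCall(Caller, Callee),
                  "callee is less thread-safe than the caller");
    return fn();
}

int LockedCounter() { return 1; }  // pretend this is CTS
int TouchGlobals() { return 2; }   // pretend this is NTS

// NTS can call anything, so both calls compile.
int InitPath() {
    return CallFrom<Safety::NTS>(Tag<Safety::NTS>{}, TouchGlobals) +
           CallFrom<Safety::NTS>(Tag<Safety::CTS>{}, LockedCounter);
}

// CTS calling NTS trips the static_assert if uncommented:
// int Bad() { return CallFrom<Safety::CTS>(Tag<Safety::NTS>{}, TouchGlobals); }
```

The obvious weakness is that nothing stops someone from calling `TouchGlobals()` directly or from passing the wrong tag, so this is a convention with a compile-time tripwire, not real enforcement.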

Another thing that would sure be handy is a way to find all statics and globals. Fucking C++ has reused the reserved words so much, there's no word at all for globals, and "static" is used for so many things it's not a very useful search.

BTW another level that might be needed is Single-Thread-Safe :

These functions are thread-safe only if they are always called from the same thread. That does not necessarily mean that they are only touching data which that thread exclusively owns - they may be touching shared data, but in a careful way that works only if only one thread is writing the data. One obvious example is lock-free single-producer data structures. They are not really CTS because if you call Push on them from any thread but the owner you are borked.
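
As an illustration, here is a minimal sketch of a single-producer, single-consumer ring buffer. The class name `SpscRing` and its interface are hypothetical, not code from this post. `push` is Single-Thread-Safe in the sense above: it is correct only if exactly one thread ever calls it, because it reads and writes `head_` without any atomic read-modify-write.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical SPSC ring buffer; holds at most N - 1 items.
template <typename T, std::size_t N>
class SpscRing {
public:
    // Single-Thread-Safe: call only from the one producer thread.
    bool push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % N;
        if (next == tail_.load(std::memory_order_acquire)) {
            return false;  // full
        }
        slots_[head] = value;
        // Publish the slot to the consumer.
        head_.store(next, std::memory_order_release);
        return true;
    }

    // Single-Thread-Safe: call only from the one consumer thread.
    bool pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) {
            return false;  // empty
        }
        out = slots_[tail];
        // Hand the slot back to the producer.
        tail_.store((tail + 1) % N, std::memory_order_release);
        return true;
    }

private:
    T slots_[N];
    std::atomic<std::size_t> head_{0};  // written only by the producer
    std::atomic<std::size_t> tail_{0};  // written only by the consumer
};
```

If two threads called `push` at once, both could read the same `head` and write the same slot, which is exactly the "borked" case.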

11 comments:

Brian said...: A bunch of people are working on pluggable type systems to do this kind of checking. But most of those are for Java.; February 19, 2009 at 7:42 PM
castano said...: One of the cool things about C# is that you can attach attributes to methods and write analysis tools that process the resulting assemblies and verify those properties.

One of the reasons why I like Qt is because all the classes are properly annotated as reentrant or thread-safe. In Qt terminology reentrant means OTS and thread-safe means CTS.

http://cartan.cas.suffolk.edu/qtdocs/threads.html#reentrancy-and-thread-safety

While this is not the most common definition of "reentrant", it's the one that I like the most.; February 19, 2009 at 10:09 PM
MH said...: Yeah, I wish C++2xxx had generalized macro language like Ruby or lisp. Just remove the current textual macro beast and replace it with a language level system with similar syntax. Poof.

OpenC++ is supposed to provide something like that, but it's ugly.

C#'s is ok, but I don't think you can check this stuff at compile time, only runtime which is nice for some things, but not as useful for others.; February 19, 2009 at 10:13 PM
cbloom said...: "Yeah, I wish C++2xxx had generalized macro language like Ruby or lisp. Just remove the current textual macro beast and replace it with a language level system with similar syntax. Poof."

Yeah, I often wish for this, and it would be fun to play with, but at the same time I am absolutely terrified of what reading other people's code would be like with this. Code sharing would become much harder as everybody would develop their own metalanguages.; February 19, 2009 at 10:25 PM
castano said...: "C#'s is ok, but I don't think you can check this stuff at compile time, only runtime which is nice for some things, but not as useful for others"

No, in C# you can easily write your own tool that inspects the compiled assemblies using the .NET reflection API or the more powerful Cecil library:

http://www.mono-project.com/Cecil

See for example gendarme:

http://www.mono-project.com/Gendarme.Rules.Concurrency; February 19, 2009 at 11:22 PM
MH said...: Oh nice, I should have thought of using those tools to write an offline tool to do that kind of checking.

I already use heavy reflection and Attributes in my home apps to do magical things like automatically building network packets, save/load, easy state machines, etc.; February 20, 2009 at 12:04 AM
Unknown said...: This comment has been removed by the author.; February 20, 2009 at 1:09 AM
Unknown said...: RE: Marking functions as thread safe etc

Not sure if you've seen this link.

http://herbsutter.spaces.live.com/blog/cns!2D4327CC297151BB!207.entry

Hopefully it's of use.; February 20, 2009 at 1:15 AM
won3d said...: http://www.ibm.com/developerworks/java/library/j-jtp09263.html

People in my neck of the woods have started to use terms like hostile, compatible, or safe w.r.t. threads.; February 20, 2009 at 10:43 AM
Thom said...: Wow, this is very close to how Rust's "compile time thread safety" traits are modeled (and this post predates Rust's by ~6y -- not that anything is ever actually novel in programming).

They aren't identical though, so you might find the differences interesting, even though I'm guessing you probably don't care about Rust all that much.

Rust does this on a per-type (not per-function) basis, and there are 2 thread-safety markers for a type:

1. `Send`, meaning a type can be safely sent to another thread.
2. `Sync`, meaning multiple threads can operate on the type in a thread-safe way without causing races. (This is mostly equivalent to the type's references being Send)

These are essentially independent, so there are 2 bits and 4 possible levels:

1. Both `Send` and `Sync`: This is a lot like CTS. In practice for Rust, this turns out to be the case for most types, since Rust is a bastard about only letting you mutate stuff if you have exclusive access to it (it does have a number of escape hatches with various tradeoffs here).
2. `Send`, but not `Sync`: This is mostly OTS. These are types that can be safely sent between threads, but wouldn't be safe to use from more than one thread at a time. (This generally happens if you choose one of the non-thread-safe options from the mutability escape hatches I mentioned; however, if you still own the data you can still send the type between threads.)
3. `Sync`, but not `Send`: This is a weird one and pretty useless in practice. This type can be operated on from multiple threads, but only by reference. Stuff that cares what thread the dtor runs on (RAII guards for pthread locks) is probably the biggest example.
4. Neither `Send` nor `Sync`: This is usually just NTS. In Rust the most common example is a non-atomic reference-counted smart pointer, which requires them all to be on the same thread.

Anyway, an interesting thing is that when applied to types, the "Single-Thread-Safe" level you note at the end can be expressed naturally. SPMC stuff in Rust is done with a 2-part object, i.e. separate producer and consumer sides. Specifically, `new_spmc_queue()` might return a tuple of `(SpmcProducer, SpmcConsumer)` where only the `SpmcConsumer` is fully `Send + Sync`, but the producer is `Send + !Sync`, i.e. only usable on one thread at a time. This is very useful in practice.

That said, per-function like you describe seems more-or-less strictly better, since it's more fine-grained, which is what you want. It just can't be enforced by a compiler, or at least not nearly as easily (Rust's rules just require restrictions on global variables, plus having functions that send stuff between threads only accept `Send` types (and that kinda thing), whereas to detect misuse on the function level, I think it would require the compiler to know what thread the code is running on...)

Anyway, super sorry for commenting on such an old post, and writing such a long comment about a language you probably don't care about. It just struck me how similar your levels are to how things work in Rust — aside from being per-function vs per-type and Rust having an additional level that doesn't matter in practice, it's essentially the same.

P.S. Thanks so much for all the writing on this blog BTW, the archives have easily some of the best writing on concurrency anywhere on the internet (which is still relevant even so many years later), and seem to cover almost everything.; January 8, 2021 at 6:43 AM
cbloom said...: It's definitely interesting to see how Rust and Go and such new languages are attacking these issues.

There's a lot of good ideas in what Joe Duffy pursued for C# and then Midori :

http://joeduffyblog.com/2016/11/30/15-years-of-concurrency/; January 8, 2021 at 7:06 AM
