08-12-11 - The standard cinit trick

Sometimes I like to write down standard tricks that I believe are common knowledge but are rarely written down.

Say you have some file that does work at "cinit" time (C++ class constructors called before main). A common example is a factory that registers itself at cinit time.
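To make the factory case concrete, here is a minimal sketch of a self-registering factory. All the names (Shape, ShapeRegistrar, shape_registry, make_shape, Circle) are made up for illustration; they are not from any particular codebase. Note that nothing outside the "Circle.cpp" part ever refers to Circle by name, which is exactly the situation where the linker can decide the file is unused.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical base class that the factory produces.
struct Shape {
    virtual ~Shape() {}
    virtual std::string name() const = 0;
};

typedef std::unique_ptr<Shape> (*ShapeMaker)();

// Function-local static, so the registry exists no matter which
// file's cinit code touches it first.
std::map<std::string, ShapeMaker>& shape_registry() {
    static std::map<std::string, ShapeMaker> registry;
    return registry;
}

// Constructing one of these at namespace scope registers a maker at cinit time.
struct ShapeRegistrar {
    ShapeRegistrar(const std::string& key, ShapeMaker maker) {
        shape_registry()[key] = maker;
    }
};

// Look up a maker by name; returns null if nothing registered under that key.
std::unique_ptr<Shape> make_shape(const std::string& key) {
    std::map<std::string, ShapeMaker>::iterator it = shape_registry().find(key);
    if (it == shape_registry().end()) return nullptr;
    return it->second();
}

// ---- everything below would live in its own file, e.g. Circle.cpp ----
namespace {
struct Circle : Shape {
    std::string name() const override { return "circle"; }
};
std::unique_ptr<Shape> make_circle() { return std::unique_ptr<Shape>(new Circle); }

// Runs before main; the only link between Circle and the rest of the program.
ShapeRegistrar circle_registrar("circle", make_circle);
}
```

In a single file this always works. The trouble only starts when the Circle part is moved into its own object file that nothing else references.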

The problem is that if nobody directly references anything in that file, it can get dropped by the linker (typically when the object file lives in a static library). That is, if all uses are through the factory or function pointers or something like that, the linker doesn't know it gets called that way and so drops the whole thing.

The standard solution is to put a reference to the file in its header. Something like this :

Example.cpp :

int example_cpp_force = 0;

AT_STARTUP( work I wanted to do );

Example.h :

extern int example_cpp_force;

AT_STARTUP( example_cpp_force = 1 );

where AT_STARTUP is just a helper that puts the code into a class so that it runs at cinit, it looks like this :

#define AT_STARTUP(some_code)   \
namespace { static struct STRING_JOIN(AtStartup_,__LINE__) { \
STRING_JOIN(AtStartup_,__LINE__)() { some_code; } } STRING_JOIN( STRING_JOIN(AtStartup_,__LINE__) , Instance ); }
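STRING_JOIN isn't shown above. A self-contained sketch of the whole thing, assuming the usual two-level token-pasting helper (STRING_JOIN_INNER is a name invented here), might look like this. The extra level is needed so that __LINE__ is expanded to a number before ## pastes it. startup_counter is just a stand-in for real startup work.

```cpp
// Assumed definition: paste through a second macro so that arguments
// such as __LINE__ are expanded before ## is applied.
#define STRING_JOIN_INNER(a, b) a##b
#define STRING_JOIN(a, b) STRING_JOIN_INNER(a, b)

// Wraps some_code in the constructor of a uniquely named struct and
// declares one static instance of it, so the code runs at cinit.
#define AT_STARTUP(some_code) \
namespace { static struct STRING_JOIN(AtStartup_, __LINE__) { \
    STRING_JOIN(AtStartup_, __LINE__)() { some_code; } \
} STRING_JOIN(STRING_JOIN(AtStartup_, __LINE__), Instance); }

// Stand-in for real startup work; zero-initialized before any cinit code runs.
int startup_counter = 0;

// Two uses on different lines get different struct and instance names.
AT_STARTUP( startup_counter += 1 );
AT_STARTUP( startup_counter += 10 );
```

By the time main starts, both constructors have run and startup_counter is 11. Two uses on the same line would collide, since __LINE__ is the only thing making the names unique.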

Now Example.obj will be kept in the link if any file that includes Example.h is kept in the link.

This works as far as I know, but it's not really ideal: for one thing, if Example.h is included a lot, you get a whole mess of little functions doing example_cpp_force = 1 in your cinit. This is one of those dumb little problems that I wish the C standards people would pay more attention to. What we really want is a way within the code file to say "hey never drop this file from link, it has side effects", which you can do in certain compilers but not portably.
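As one illustration of the compiler-specific route, MSVC lets a source or header file pass the /include option to the linker, which forces a symbol to be resolved without generating any cinit code. The sketch below is only that, a sketch: the extern "C" linkage is an assumption added here so the symbol name is predictable, the leading underscore is the 32-bit x86 C name decoration, and on other compilers the #if block compiles to nothing, so you would still need the AT_STARTUP reference or a linker flag there.

```cpp
// ---- Example.cpp side ----
// C linkage (an assumption for this sketch) keeps the symbol name undecorated.
extern "C" {
    int example_cpp_force = 0;
}

// ---- Example.h side ----
// Instead of AT_STARTUP( example_cpp_force = 1 ), ask the MSVC linker
// to treat the symbol as referenced. No code is generated per includer.
#if defined(_MSC_VER)
  #if defined(_M_IX86)
    // 32-bit x86 prefixes C symbols with an underscore.
    #pragma comment(linker, "/include:_example_cpp_force")
  #else
    #pragma comment(linker, "/include:example_cpp_force")
  #endif
#endif
```

The pragma still has to sit in something that is already part of the link (the header, here), so this removes the mess of little functions but not the need for the header reference.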


Tom said...

What is the anonymous namespace for?

I might actually steal this macro. It's less freakish than some of the other workarounds I've seen, presumably safer than doing nothing, and less annoying than calling all your init stuff from main (which is probably the only 100% reliable option :).

cbloom said...

"What is the anonymous namespace for?"

So that definitions of the same class in different files don't conflict. It makes the decorated name unique to the current translation unit.

Autodidactic Asphyxiation said...

Are linkers smart about laying out the code so that each pre-main init method doesn't cause a shiny new miss on startup?

I have some pretty mixed feelings on auto-registration. Now, I think it's not worth it. I guess, as a library author, manual init is a pain for your clients, who have better things to do than to become experts in your library.

Maybe a best-of-both-worlds is for the library to require manual init, and all the auto-registration actually happens in a single optionally-static-linked file. That way, you can also deal with the static initialization order fiasco.

cbloom said...

As a library author at RAD I'm very opposed to auto-init. In fact I'm trying to avoid CINIT at all because it imposes constraints on the client that they might not want.

The C++ cinit / singleton system works just fine, but it's one of those things in C++ that only works if you drink the whole kool-aid and use good meta-language rules, which not all people do.

In my personal code I love auto init.

Tom Seddon said...

I'm still not convinced that the anonymous namespace is necessary (I'm the Tom above btw), because the constructor, being implicitly inline, can legally have multiple identical definitions.

The only conflict-type thing I can spot here is the possibility of __LINE__ creating two structs with the same name in the preprocessed output - but anonymous namespaces won't save you from that, because they're per translation unit rather than per file.

My test in Xcode seemed to bear out my suspicion but perhaps there's some subtlety I'm missing and/or I need to go and read the C++ standard again.

(Also it took me about 5 minutes to re-find-out, again, the magic behind a working STRING_JOIN. So perhaps I shouldn't trust my C++ judgement in the first place.)

cbloom said...

"I'm still not convinced that the anonymous namespace is necessary (I'm the Tom above btw), because the constructor, being implicitly inline, can legally have multiple identical definitions."

Yeah you may be right. Certainly in MSVC it seems to be legal to have identically named classes that do different things in different translation units.

BTW one niggle - the crucial thing is that they are *not* identical, the AT_STARTUPs in different files may do different things but have the same name. We don't want the linker collapsing them. Which it seems it doesn't.

In any case, I prefer to leave the anonymous namespace because it makes it clearer to me what's happening.

I strongly object to the school of thought that says you should remove syntax when it's unnecessary. I say you should add syntax when it makes it clearer what's happening. (eg. extra braces, extra parens, what have you).
