12/17/2008

12-17-08 - Junk

tmpnam() and tmpfile() are making files in the root of my machine. According to the docs they're supposed to put files in your temp dir as set by "TMP", but all I ever see is stuff like "\s38c.1". The Windows functions GetTempPath / GetTempFileName seem to work fine.


Is this legal ?


static int x = func(&x);

I did the Casey-style tweakable var thing where you watch your C files, and I want to be able to initialize a variable and also grab its address in just one statement. It works perfectly fine in MSVC, I just wonder if it will break on some platform; I can't seem to find anything in the C standard about whether the address of a variable is defined before the variable is finished initializing. (I know for example that you aren't "supposed" to use "this" during a constructor, but everybody does it and it works fine in every compiler I've ever seen).
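For what it's worth, here is a minimal sketch of the one-statement pattern in C++. As far as I can tell the name is in scope as soon as its declarator ends, so taking &x inside its own initializer is fine as long as the function only stores the pointer and never reads through it. In plain C this would not compile, because a static initializer there has to be a constant expression. RegisterTweak, ApplyTweak and the little registry are hypothetical names made up for illustration, not the actual tweak system.

```cpp
#include <cstddef>

// Hypothetical registry for illustration; none of these names come from a real library.
struct TweakEntry { const char *name; int *address; };

static TweakEntry g_tweaks[64];
static std::size_t g_tweakCount = 0;

// Records the variable's address and returns the value to initialize it with.
// It only stores the pointer; it never reads through it, because the
// variable has not been given its value yet.
static int RegisterTweak(const char *name, int *address, int initialValue)
{
    if (g_tweakCount < 64)
    {
        g_tweaks[g_tweakCount].name = name;
        g_tweaks[g_tweakCount].address = address;
        ++g_tweakCount;
    }
    return initialValue;
}

// One statement: initialize the variable and capture its address.
static int s_speed = RegisterTweak("s_speed", &s_speed, 10);

// What a source file watcher might call after re-parsing a changed value.
static void ApplyTweak(std::size_t index, int newValue)
{
    *g_tweaks[index].address = newValue;
}
```

The registry itself is zero/constant initialized, so it is ready before s_speed's initializer runs.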


It sucks that MSVC doesn't have fmemopen(). It would let me page data into a memory buffer and then fmemopen on that and give it back to clients who want to just read bits with stdio because they have functions already written that use stdio.

More generally, I wish stdio FILE had actually been defined to use function pointers the way it does in some GCC POSIX implementations (glibc has fopencookie, the BSDs have funopen). Then you could plug in anything and FILE would be a true virtual file. That way people could just write code to talk to stdio, and I could secretly pass them Oodle file handles and it would all just work.

Instead I have to make my own virtual file layer, and then maybe some #defines or something if you want to just stomp your stdio calls to Oodle.

The function pointer indirection is really not a performance cost, because getc will still be a macro that goes straight to a buffer, and the function pointer only needs to get called to fill the buffer. The stdio buffer these days is not actually a file system buffer - it's way too small, that's just not its job. The file system is buffering the disk in 256k chunks, stdio has a little 4k buffer whose purpose is to reduce the number of times you need to talk to the OS, just to cut down on function call overhead. I'm planning on using the same method for the Oodle virtual file, but with just a 256 byte or maybe 1k buffer, which is just there so you can do fast macro getc() and only rarely jump through a function pointer to do big reads or refill the little buffer.
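Roughly what I have in mind, as a sketch only: a small buffer, a fast inline getc, and one function pointer that only gets called on refill. VFile, VFile_Getc, MemSource and friends are hypothetical names for illustration, not the real Oodle API, and the memory backend is just the simplest thing to plug in.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical virtual file: a small buffer plus one function pointer.
struct VFile
{
    // Refill callback: writes up to 'capacity' bytes into 'dst' and returns
    // the number of bytes written; 0 means end of file.
    std::size_t (*refill)(VFile *file, unsigned char *dst, std::size_t capacity);
    void *user;                 // backend state (memory block, OS handle, ...)
    unsigned char *cur;         // next byte to hand out
    unsigned char *end;         // one past the last valid byte
    unsigned char buffer[256];  // small on purpose
};

const int VFILE_EOF = -1;

// Slow path: jump through the function pointer to refill the buffer.
inline int VFile_Fill(VFile *f)
{
    std::size_t got = f->refill(f, f->buffer, sizeof(f->buffer));
    f->cur = f->buffer;
    f->end = f->buffer + got;
    return got ? *f->cur++ : VFILE_EOF;
}

// Fast path: usually just a compare and a pointer bump.
inline int VFile_Getc(VFile *f)
{
    return (f->cur < f->end) ? *f->cur++ : VFile_Fill(f);
}

// Example backend: read from a block of memory.
struct MemSource { const unsigned char *data; std::size_t size; std::size_t pos; };

inline std::size_t MemRefill(VFile *f, unsigned char *dst, std::size_t capacity)
{
    MemSource *src = static_cast<MemSource *>(f->user);
    std::size_t n = src->size - src->pos;
    if (n > capacity) n = capacity;
    std::memcpy(dst, src->data + src->pos, n);
    src->pos += n;
    return n;
}

inline void VFile_OpenMemory(VFile *f, MemSource *src)
{
    f->refill = MemRefill;
    f->user = src;
    f->cur = f->end = f->buffer;  // empty buffer, so the first getc refills
}
```

With a 256 byte buffer the indirect call happens once per 256 calls to VFile_Getc, and a backend that wants big reads can do them behind the callback.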


I'm working on my laptop, which is way way slower than my awesome work machine. My Oodle test game runs at 10 fps on the laptop and 200 fps at work. It's letting me find lots of little paging bugs. Yikes. For real release testing I'm probably going to have to put in a mechanism to run at artificial slow framerates, and maybe also some randomized frame durations to really try to stress all possible orders of things occurring.
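One way the slow-frame mechanism might look, as a sketch under assumptions: a seeded random stall added at the end of each frame, so a bad ordering can be replayed with the same seed. FrameThrottle and the millisecond range are hypothetical, not something that exists in Oodle.

```cpp
#include <chrono>
#include <random>
#include <thread>

// Hypothetical test helper; the names and the millisecond range are made up.
class FrameThrottle
{
public:
    FrameThrottle(unsigned seed, int minExtraMs, int maxExtraMs)
        : m_rng(seed), m_extraMs(minExtraMs, maxExtraMs) {}

    // Picks how long to stall this frame. Seeded, so a run can be repeated.
    int NextDelayMs() { return m_extraMs(m_rng); }

    // Call once at the end of each frame to stretch it out.
    void EndFrame()
    {
        std::this_thread::sleep_for(std::chrono::milliseconds(NextDelayMs()));
    }

private:
    std::mt19937 m_rng;
    std::uniform_int_distribution<int> m_extraMs;
};
```

Something like FrameThrottle throttle(1234, 5, 100) with throttle.EndFrame() in the main loop would bounce the frame time around between roughly 5 and 100 ms on top of the real work.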


I'm annoyed by having "Oodle" and "rad" in front of all my function names. It's nice to expose it to clients that way because it makes it so you never have possible conflicts with their definitions, but it's annoying for me during dev. It ruins typing autocomplete and browse info. I think "oh, I want to open this file" and start typing "ood.." and it gets me nothing, whereas I could just be typing "open.." and it would complete for me.

I might have to do the thing Casey did where I use nice short names internally for myself and then run it through a bunch of wrappers to expose long names to the outside world.

Back on Galaxy and all my old libraries, and back when I did C-style OOP at Eclipse, I would always make the names of things show what system they're in. So all the Galaxy stuff is gFile, gVec, etc. I now realize that's bad. For one thing it's bad for the autocomplete and so on. It's bad for file name browsing. It's bad for pronounceability. But more importantly it doesn't help. In C++ you can just wrap all your junk in a namespace to prevent conflicts, and then within your namespace you can rock on with short names.
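A sketch of how the two ideas could fit together: short names inside a namespace for me, thin prefixed wrappers for clients. The namespace oodle_internal and every function here are hypothetical placeholders, not the real API.

```cpp
#include <cstddef>

// Internal code: short names, kept out of the client's way by a namespace.
namespace oodle_internal  // hypothetical name
{
    struct File { const char *name; std::size_t pos; };

    inline void Open(File *f, const char *name) { f->name = name; f->pos = 0; }
    inline void Seek(File *f, std::size_t pos)  { f->pos = pos; }
    inline std::size_t Tell(const File *f)      { return f->pos; }
}

// Public API: long prefixed names that just forward to the short ones.
typedef oodle_internal::File OodleFile;

inline void OodleFile_Open(OodleFile *f, const char *name) { oodle_internal::Open(f, name); }
inline void OodleFile_Seek(OodleFile *f, std::size_t pos)  { oodle_internal::Seek(f, pos); }
inline std::size_t OodleFile_Tell(const OodleFile *f)      { return oodle_internal::Tell(f); }
```

Inside the library I type Open and Seek and autocomplete works; the client only ever sees the OodleFile_ names.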

Furthermore, short generic names are better for metaprogramming and templates. If your functions are named Quat_Length() and Vec_Length() and so on, I can't write generic functions that work on both. But if it's just Length() and Distance() and such, I can write lovely generics. Even without templates this is a win because it lets you copy-paste code, or change your data types. Like say you have a chunk of code that works on Vec3. You decide it needs to work on Vec4 instead. If your functions are Vec3_Add and Vec3_Subtract then you have to change a ton of code. If it's just Add() and Subtract() you change the data types and it just works.
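A small sketch of what that buys you. The Vec3 and Vec4 here are bare-bones stand-ins, not my real vector classes; the point is only that Length() and Distance() get written once and work on anything that has Subtract() and Dot() overloads.

```cpp
#include <cmath>

// Bare-bones stand-in types for illustration.
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

// Same short names for every type; overloading picks the right one.
inline Vec3 Subtract(const Vec3 &a, const Vec3 &b) { return Vec3{ a.x - b.x, a.y - b.y, a.z - b.z }; }
inline Vec4 Subtract(const Vec4 &a, const Vec4 &b) { return Vec4{ a.x - b.x, a.y - b.y, a.z - b.z, a.w - b.w }; }

inline float Dot(const Vec3 &a, const Vec3 &b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline float Dot(const Vec4 &a, const Vec4 &b) { return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w; }

// Written once, works for any type that provides Dot().
template <typename V>
float Length(const V &v)
{
    return std::sqrt(Dot(v, v));
}

// Written once, works for any type that provides Subtract() and Dot().
template <typename V>
float Distance(const V &a, const V &b)
{
    return Length(Subtract(a, b));
}
```

Code that calls Distance(a, b) keeps compiling when a and b change from Vec3 to Vec4; with Vec3_Subtract style names every call site would need editing.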

Algorithms are separate from the data they work on. That's the big epiphany of Stepanov. In fact, Stepanov is a weirdo and thinks the most important thing is to find the absolutely minimal constraint that the algorithm places on the data it works on.

18 comments:

800poundgames said...

Why not just create a static C++ object? You can call your function in the ctor.

800poundgames said...

Also Windows supports memory mapped file IO. So although it's not nicely wrapped up for you, you should be able to write the equivalent of fmemopen for MSVC.

Tom.

cbloom said...

"Also Windows supports memory mapped file IO. So although it's not nicely wrapped up for you, you should be able write the equivalent of fmemopen for MSVC."

Hmmm, I guess that is true. It's rather nasty. You'd have to make a stdio FILE from scratch (probably easiest to dup stdin or stdout), then you can poke into it and change the file pointer inside to a windows Handle (you can trace into the CRTL and see that the low level file descriptor is just a Windows Handle, of course).

I thought there might be a routine already to make a FILE from a HANDLE but I can't find one.

cbloom said...

"Why not just create a static C++ object? You can call your function in the ctor."

Because I want to transparently be able to make any variable tweakable (and not change any other code and have it work). To make it an object I'd have to change float to Float or probably Tweakable<float>. That is a path that would work but I really don't like the C++ object masquerading as a basic type paradigm.

800poundgames said...

struct TweakableInt { TweakableInt() { /* do stuff */ } operator int&() { return m_data; } int m_data; };

...

static int x = TweakableInt();


What's wrong with that???

800poundgames said...

Sorry that should be: static int& x = TweakableInt();

cbloom said...

Ahh, interesting, I hadn't thought of making it a reference. I'll have to look into that...

Ivan-Assen said...

Artificial slow framerates - very good idea, helps with those cases where something flicks the wrong way for a frame and you never notice it because you run on an empty test map which runs at 200 fps, and nobody can make a screenshot to demonstrate it for you even on real maps at 30 fps.

Autodidactic Asphyxiation said...

Doesn't TweakableInt() create a temporary that dies before you get to do anything with x? I guess that's just a nit since you probably just want to put the "static" with the TweakableInt and not the int.

But IMO having static non-POD objects gets to be kind of a pain, although this seems like a harmless enough use.

cbloom said...

I mean, I can easily just do

static int x = val;
static Registrar temp(&x);

and that works fine and is straightforward, but creates two statics instead of one. I'm not sure that's a bad thing.

If you put statics in local scope, it does generate an "if initialized" check each time the scope is entered.

Autodidactic Asphyxiation said...

Right, this is one of the great things about C++, where "static" means different things in different scopes. Still, you sometimes have to be careful of where references escape to, since you don't have much control over destruction order, and this gets particularly hairy with threads. I don't think these would really be issues for Casey's tweak thing, mostly because your lifetimes are pretty simply defined.

But say you have some struct that is constructed via TweakInt::operator int. That struct isn't normally tweakable, so maybe you have it take a TweakInt&. Well, now the lifetime of that struct has to be within the lifetime of your tweak code, but you have to manage that all yourself or you risk crash-on-exit problems. You can't rely on the "statics are destructed in the opposite order of creation" rule.

cbloom said...

Yeah, actually for all my Singletons now I do construct-on-first-use but I require manual calls to destruct. Like:

static Singleton * s_the = NULL;

static Singleton * Get() { if ( ! s_the ) s_the = new Singleton; return s_the; }

static void Release() { if ( s_the ) delete s_the; s_the = NULL; }

That is, I never use the older school approach of using local statics to initialize on first use that looked like this:

static Singleton & Get() { static Singleton s_the; return s_the; }

cbloom said...

A similar related thing I've been changing recently is getting rid of static const class members. I used to think it was cool to provide things like :

static const Vec3 c_zero(0,0,0);

or

static const String c_empty("");

but you have to be careful not to use them during CINIT because order of initialization is not defined, and of course if it's a string you have to worry about the allocations at startup and frees at shutdown.

It's a shame because const values like that are convenient and should be very efficient and easy to use.

Oh, I also *hate* *hate* the fact that MSVC shows static members in the debug view of classes. My freaking Color class looks like
Color { black={...} white={...} red={....} , .. } in the debugger, which is a nightmare. I know I can go and change the debugger expansion in that magic secret file, but it's a pain. That alone is almost enough reason to stop using static const members.

Autodidactic Asphyxiation said...

In C++, for things that don't require construction/destruction, you can define constants and POD objects as const in headers (const int, const char* const, const char[]) and they have internal linkage, as opposed to class statics which have external linkage. C++ (and not C) won't give you multiply defined symbols, and generally just does the right thing.

I'm not at all sure what "static const String c_empty" is supposed to accomplish.

Sean said...

If you put statics in local scope, it does generate an "if initialized" check each time the scope is entered.

This isn't true in C, and I would have thought C++ would handle the cases with weirder initializers the same way it handles those at global scope. Are statics at local scope actually allowed to refer to locals when initializing, or something?

But more importantly it doesn't help. In C++ you can just wrap all your junk in a namespace to prevent conflicts, and then within your namespace you can rock on with short names.

I'm still waiting for you to explain "it doesn't help". Yes, the C++ namespacing accomplishes the same thing--and better, because it's easier to change if there is a conflict--but if you're writing bare C, it does help avoid conflicts in the obvious way. So I don't know what "it doesn't help" means.

Furthermore, short generic names are better for metaprogramming and templates. If your functions are named Quat_Length() and Vec_Length() and so on, I can't write generic functions that work on both. But if it's just Length() and Distance() and such, I can write lovely generics.

But, in the other direction, it makes code that much harder to debug; you can grep for "Quat_Length" way more easily than you can grep for "Length" and find the right thing. Now you need to be in a running debugger or a smart IDE to be able to read the code. If you have total control over your choice of tools and nobody else needs to use your code, then maybe you can rely on a smart IDE, but if you're arguing for this as a practice for other people it becomes more complex. (In general I tend to prefer solutions that don't rely on IDE smarts for that reason.)

Also, for anyone else reading, if the only disadvantage to prefixing you care about is autocomplete-y things, you could try suffixing instead of prefixing.

Sean said...

I should have said "it makes unfamiliar code much harder to debug".

cbloom said...

"Are statics at local scope actually allowed to refer to locals when initializing, or something?"

Yeah, and they can call functions and such, it's crucial that they aren't constructed until first use, that's how the whole singleton pattern works.


"I'm still waiting for you to explain "it doesn't help""

Yeah, that was referring to C++; obviously in straight C it's crucial. In C++ I don't think putting the type in the name does much of anything good because you have the type in the arg info, and presumably you're using some kind of browser which knows about that. There is the issue of the "reverse find" which we discussed elsewhere.

"it makes unfamiliar code much harder to debug"

Personally I don't find that it makes it harder to debug, because if I'm in a debugger I can trace around, and that means I have browse info and so on. But you are right that if you're just *reading* unfamiliar code in a text editor, then the absolute minimum of fancy features does make it easier to follow. But that means -

no macros
no member functions
no virtual functions
no templates
no overloading
etc.

all of which can make it impossible to deduce code flow just by reading in a text editor.

I guess it comes down to how important you think it is to be able to trace through unfamiliar code by eye. I do believe that it's very important (and for that reason I try to avoid things like implicit constructors which create code flow that you cannot see at all by visual inspection).

It would be nice if you could take a snippet of C++ code and hit a button and get a nice easy to read plain C version to make it easier to read unfamiliar code. We talked to Jeff about this when he was looking at std::sort; in theory you could use something like cfront but it just generates unreadable gunk.

castano said...

"I might have to do the thing Casey did where I use nice short names internally for myself and then run it through a bunch of wrappers to expose long names to the outside world."

This is also a good idea to prevent symbol conflicts on Unix systems. A library should never use its own public API.

Imagine you do:

CG_CREATEPROGRAM_PROC cgCreateProgram;
cgCreateProgram = dlsym(cg_handle, "cgCreateProgram");

If you declare cgCreateProgram as a global function pointer and libCg.so uses cgCreateProgram internally (not as a function pointer), then nasty things will happen.

I think that's why QuakeGL linked to OpenGL dynamically and prefixed the symbols with the qgl prefix. Some OpenGL implementations did not follow that rule.
