6/28/2011

06-28-11 - String Extraction

So I wrote a little exe to extract static strings from code. It's very simple.

StringExtractor just scans some dirs of code and looks for a tag which encloses a string with parens. eg. :

    MAKE_STRING( BuildHuff )
it takes all the strings it finds in that way and makes a table of indexes and contents, like :


enum EStringExtractor
{
    eSE_Null = 0,
    eSE_BuildHuff = 1,
    eSE_DecodeOneQ = 2,
    eSE_DecodeOneQ_memcpy = 3,
    eSE_DecodeOneQ_memset = 4,
    eSE_DecodeOneQuantum = 5, ...


const char * c_strings_extracted[] = 
{
    0,
    "BuildHuff",
    "DecodeOneQ",
    "DecodeOneQ_memcpy",
    "DecodeOneQ_memset",
    "DecodeOneQuantum", ...

it outputs this to a generated .C and .H file, which you can then include in your project.
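To make the scan step concrete, here's a minimal sketch of how the label collection could work. This is not the actual StringExtractor source; ExtractLabels and MakeEnumText are hypothetical names, and sorting the labels is my assumption (it keeps the indexes stable from run to run). It is also naive about things like the #define of MAKE_STRING itself, which a real tool would want to skip.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: collect every label that appears as TAG( label ) in `text`.
// Labels come back sorted and de-duplicated so index assignment is stable.
static std::vector<std::string> ExtractLabels(const std::string & text, const std::string & tag)
{
    std::set<std::string> found;
    std::size_t pos = 0;
    while ((pos = text.find(tag, pos)) != std::string::npos)
    {
        std::size_t p = pos + tag.size();
        pos = p; // next search starts after this tag
        while (p < text.size() && std::isspace((unsigned char)text[p])) ++p;
        if (p >= text.size() || text[p] != '(') continue;
        ++p;
        while (p < text.size() && std::isspace((unsigned char)text[p])) ++p;
        const std::size_t start = p;
        // only C-identifier characters are allowed in a label
        while (p < text.size() && (std::isalnum((unsigned char)text[p]) || text[p] == '_')) ++p;
        const std::size_t end = p;
        while (p < text.size() && std::isspace((unsigned char)text[p])) ++p;
        if (end > start && p < text.size() && text[p] == ')')
            found.insert(text.substr(start, end - start));
    }
    return std::vector<std::string>(found.begin(), found.end());
}

// Emit the enum half of the generated header; index 0 is reserved for eSE_Null.
static std::string MakeEnumText(const std::vector<std::string> & labels)
{
    std::string out = "enum EStringExtractor\n{\n    eSE_Null = 0,\n";
    for (std::size_t i = 0; i < labels.size(); ++i)
        out += "    eSE_" + labels[i] + " = " + std::to_string(i + 1) + ",\n";
    out += "};\n";
    return out;
}
```

The string table half would be generated the same way from the same label list, so the enum values and table slots always line up.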

The key then is what MAKE_STRING means. There are various ways to set it up, depending on whether you are replacing an old system that uses char * everywhere or not. Basically you want to make a header that's something like :


#if DO_STRINGS_RAW

#define MAKE_STRING( label )   (string_type)( #label )
#define GET_STRING(  index )   (const char *)( index )

#else

#include "code_gen_strings.h"

#define MAKE_STRING( label )  (string_type)( eSE_ ## label )
#define GET_STRING(  index )  c_strings_extracted[ (int)( index ) ]

#endif

(string_type can either be const char * to replace an old system, or if you're doing this from scratch it's cleaner to make it a typedef).

If DO_STRINGS_RAW is on, you run with the strings in the code as normal. With DO_STRINGS_RAW off, all static strings in the code are replaced with indexes and the table lookup is used.
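Here's a small sketch of what client code looks like with DO_STRINGS_RAW off. The enum and table are hand-written stand-ins for the generated file, string_type is assumed to be an int typedef (the from-scratch case), and TimingRecord / RecordName are made-up names for illustration.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the generated code_gen_strings.h / .c (hand-written here).
enum EStringExtractor { eSE_Null = 0, eSE_BuildHuff = 1, eSE_DecodeOneQ = 2 };
static const char * c_strings_extracted[] = { 0, "BuildHuff", "DecodeOneQ" };

// Assumption: string_type is an integer handle rather than const char *.
typedef int string_type;

#define MAKE_STRING( label )  (string_type)( eSE_ ## label )
#define GET_STRING(  index )  c_strings_extracted[ (int)( index ) ]

// Hypothetical profiler-style record that stores the handle, not a char pointer.
struct TimingRecord
{
    string_type name;
    double      seconds;
};

// The table is only touched here, at the point where text is actually needed.
static const char * RecordName(const TimingRecord & r)
{
    return GET_STRING(r.name);
}
```

A record built with MAKE_STRING( BuildHuff ) carries the value 1 around, and only RecordName needs the table linked in.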

It's important to me that the code gen doesn't actually touch any of the original source files, it just makes a file on the side (I hate code gen that modifies source because it doesn't play nice with editors); it's also important to me that you can set DO_STRINGS_RAW and build just fine without the code gen (I hate code gen that is a necessary step in the build).

Now, why would you do this? Well, for one thing it's just cleaner to get all static strings in one place so you can see what they are, rather than having them scattered all over. But some real practical benefits :

You can make builds that don't have the string table; eg. for SPU or other low-memory console situations, you can run the string extraction to turn strings into indices, but then just don't link the table in. Now they can send back indices to the host and you can do the mapping there.

You can load the string table from a file rather than building it in. This makes it optional and also allows localization etc. (not a great way to do this though).

For final builds, if you are using these strings just for debug info, you can easily get rid of all of them in one place just by #defining MAKE_STRING and GET_STRING to nulls.

Anyhoo, here's the EXE :

stringextractor.zip (84k)

(stringextractor is also the first cblib app that uses my new standardized command line interface; all cblib apps in the future will have a common set of -- options; also almost all cblib apps now take either files or dirs on the command line and if you give them dirs they iterate on contents).

(stringextractor also importantly uses a helper to not change the output file if the contents don't change; this means that it doesn't mess up the modtime of the generated file and cause rebuilds that aren't necessary).
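The "don't touch the file if nothing changed" helper is simple enough to sketch. This is not the cblib helper, just a minimal version of the idea; WriteFileIfChanged is a hypothetical name and there is no error handling.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if the file was (re)written,
// false if the existing contents already matched.
static bool WriteFileIfChanged(const std::string & path, const std::string & contents)
{
    {
        std::ifstream in(path.c_str(), std::ios::binary);
        if (in)
        {
            std::stringstream existing;
            existing << in.rdbuf();
            if (existing.str() == contents)
                return false; // identical: leave the file and its modtime alone
        }
    }
    // missing or different: write the new contents
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    out << contents;
    return true;
}
```

The generator builds the whole output in memory and calls this at the end, so an unchanged set of strings never triggers a rebuild.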

Obviously one disadvantage is you can't have spaces or other non-C-compatible characters in the string. But I guess you could fix this by using printf style codes and do printf when you generate the table.

6/24/2011

06-24-11 - Regression

Oodle now can run batch files and generate this :

                      test_cdep.done  test_huff.done  test_ioqueue.done  test_lzp.done   test_oodlelz.done
r:\test_done_xenon
r:\test_done_ps3      pass            pass            pass               pass : 128.51   pass : 450.50
r:\test_done_win32    pass            pass            fail               pass : 341.94   pass : 692.58

                      test_cdep.done  test_huff.done  test_ioqueue.done  test_lzp.done   test_oodlelz.done
r:\test_done_xenon
r:\test_done_ps3      pass            pass            pass               pass : 128.03   pass : 450.73
r:\test_done_win32    pass            pass            pass               pass : 335.55   pass : 686.90

Yay!

Something that's important for me is doing constant runs of the speeds of the optimized bits on all the platforms, because it's so easy to break the optimization with an innocuous check-in, and then you're left trying to find what slowed you down.

Two niggles continue to annoy me :

1. Damn Xenon doesn't have a command line interface (by which I mean you can't actually interact with running programs from a console; you can start programs; the specific problem is that you can't tell if a program is done or still running or crashed from a console). I have my own hacky work-around for this which is functional but not ideal. (I know I could write my own nice xenon-runner with the Dm API's but I haven't bitten that off yet).

2. Damn PS3TM doesn't provide "force connect" from the command line. They provide most of the commands as command line switches, but not the one that I actually want. Because of this I frequently have problems connecting to the PS3 during the regression run, and I have to open up the damn GUI and do it by hand. This is in fact the only step that I can't automate and that's annoying. I mean, why do they even fucking bother with providing the "connect" and "disconnect" options? They never fucking work, the only thing that works is "force disconnect". Don't give me options that just don't work dammit.

(and the whole idea of PS3TM playing nice and not disconnecting other users is retarded because it doesn't disconnect idle people, so when someone else is "connected" that usually means they were debugging two days ago and just never disconnected)

(there is a similar problem with the Xenon (similar in the sense that it can't be automated); it likes to get itself into a state where it needs to be cold-booted by physically turning it off; I'm not sure why the "cold boot" xbreboot is not the same as physically power cycling it, so that's mildly annoying too).

6/23/2011

06-23-11 - Map File Graphviz

What I want :

Something that parses the .map and obj's and creates a graph of the size of the executable. Graph nodes should be the size they take in the exe, and connections should be dependencies.

I have all this code for generating graphviz/dotty because I do it for my allocation grapher, but I don't know a good way to get the dependencies in the exe. Getting the sizes of things in the MAP is relatively easy.

To be clear, what you want to see is something like :

s_log_buf is 1024 bytes
s_log_buf is used by Log() in Log.cpp
Log() is called from X and Y and Z
...
just looking at the .map file is not enough, you want to know why a certain symbol got dragged in. (this happens a lot, for example some CRT function like strcpy suddenly shows up in your map and you're like "where the fuck did that come from?")
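Just to pin down the output side: assuming you somehow have the symbol sizes and the "who uses what" list (getting that list is the actual open problem), the dot generation is trivial. SymbolNode, SymbolEdge and MakeDot are hypothetical names for this sketch, not my allocation grapher code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical inputs: symbol sizes (as parsed from a .map) and "user -> used" edges.
struct SymbolNode { std::string name; int bytes; };
struct SymbolEdge { std::string user; std::string used; };

// Build Graphviz "dot" text: one node per symbol labeled with its size,
// one directed edge per dependency.
static std::string MakeDot(const std::vector<SymbolNode> & nodes,
                           const std::vector<SymbolEdge> & edges)
{
    std::string out = "digraph exe {\n";
    for (std::size_t i = 0; i < nodes.size(); ++i)
    {
        // "\\n" puts a literal \n in the label, which dot renders as a line break
        out += "  \"" + nodes[i].name + "\" [label=\"" + nodes[i].name + "\\n"
             + std::to_string(nodes[i].bytes) + " bytes\"];\n";
    }
    for (std::size_t i = 0; i < edges.size(); ++i)
        out += "  \"" + edges[i].user + "\" -> \"" + edges[i].used + "\";\n";
    out += "}\n";
    return out;
}
```

Scaling the drawn node size by the byte count would be one more attribute on each node line.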

Basically I need the static call-graph or link tables, a list of the dependencies from each function. The main place I need this is on the SPU, because it's a constant battle to keep your image size minimal to fit in the 256k.

I guess I can get it from "objdump" pretty easily, but that only provides dependency info at the obj level, not at the function level, which is what I really want.

Any better solutions?

6/21/2011

06-21-11 - Houses

Well I put an offer on a house, but it doesn't look like I'll get it. The multiple-offer scenario is very strange, I always thought it was like an actual auction, like I put in an offer, they tell me what the highest other offer is and I get a chance to beat it. Not so. The seller can basically tell you anything they want to manipulate you, which is a real turn off to me and makes me just want to walk away. (in general in life I'm horrible at dealing with interactions and problems with people - my only weapon is to walk away). This seller basically said "you need to offer a lot more" but wouldn't say what the actual other offers were, so are they just trying to trick me into offering too much? Fuck that.

It's difficult keeping focused on the house hunt. We put a ton of time into looking at that house; I think I visited it 5 times to check out the neighbors and neighborhood at different times of day (you have to try to random sample for dogs yapping, children playing drums, retired people doing amateur construction projects, etc.). It's so much work that it's probably +EV for me to actually get a "bad deal" on the house. I think I might be too worried about making a good investment and could make a mistake of investing too much time searching and not getting what I really want.

Anyway, writing down some things that I've been mulling.

Houses are More Stress than Renting

I've heard this from a lot of home owners, and I'm sure it's true, but it's also largely just in your head. For example, home owners stress about the value of their home when the market goes up and down. But you don't have to, you can just not worry about that and just think of your mortgage as rent. It's not actually affecting you in any way day to day. Similarly, home owners stress about doing maintenance and home improvements; they always have a big todo list of stuff to do to the house (fix the squeaky door, find the basement leak, replace the roof shingles, etc.), but you don't have to do those things. I've lived in a lot of rental houses over the last 15 years, and I have never seen a home owner do maintenance on *any* of them *ever*. Not even things like pruning or gutter cleaning that should be done annually. So clearly houses hold up just fine if you do no maintenance at all. The reason rentals are relaxing is because you see problems with the house and you just think "meh, not my problem". But you could treat the home you own the same way. I'm a total uptight type-A so this is a major trap for me that I have to avoid. This relates to ...

Priorities

It's one of the sad/funny truths of life that most people actively make their own lives worse. Maybe you live in the city in a shitty apartment, or even in a group house, it's uncomfortable, it's noisy, but you hang out with your friends, you're surrounded by the vitality of the city, you're actually very happy. But when you get money, you go buy a comfortable suburban house. Now you're far away from your friends and you don't see them any more, you have to drive everywhere, you spend your time mowing the lawn and watching TV and home improving, you're more comfortable, but your life is actually much worse.

Most people are horrible decision makers when it comes to their own happiness. Humans tend to prioritize elimination of discomfort, and in reality that is not actually important to happiness. (for example, stupid people spend their time buying the newest fancy hiking gear so they can be comfortable if they ever actually go, smart people just don't care and go hiking in the rain anyway, it may be uncomfortable, but afterward you forget the discomfort and have only the happiness).

I think that in general, home buyers over-value the actual house. Whether it's cute, whether it's in good condition, etc. These things don't actually matter that much to your quality of life. Oh no, I don't have cute decorative wood trim in my living room, my life sucks. No, actually these are probably the least important things.

Perhaps the most under-valued thing of all is picking a location near your family and friends. People always think "I don't want to sacrifice picking the ideal house for this because they might move anyway" or "we can still hang out even if I live further away" , but in fact you won't still hang out, and that is a major quality of life loss.

I am as susceptible to this as anyone; the problem is you see a charming beautiful house and it fills you with visions of how good your life would be there, but in reality that is just an illusion (it's sort of like marrying a beautiful woman - in the long run, the beauty is not the important thing to quality of life, it's how you get along, if she's understanding and compassionate and reasonable and fun, etc.)

I've been trying to figure out what actually matters to me. It's something like this (not in order) :


1. Bedroom where you can open the windows and it's not super noisy

2. No neighbor problems ; ideally see as little of the neighbors as possible ; no apartments or home
improvers.

3. Some private space where I can be naked in the sunshine

4. Some garage space where I can work on my car and bikes in peace

5. Sunny land for a garden

6. Walkable to groceries (about 0.5 miles max)

7. Good room for a home office ; has to be reasonably isolated from rest of the house

8. Not intolerable commute to kirkland ; even though I only do it two days a week, it does create a massive
amount of anger

Decision Making

I find making these kinds of large extended decisions very difficult. The problem is you sort of go off chasing lines of thought and don't reset.

You start with certain criteria in mind. Then something comes up and it makes you think "okay maybe I should consider more expensive places too", so now you have to consider them and weigh that in too. Then something else comes up and you chase another thread, then you visit one house and it has a nice out-building and you have to weigh that against the original factors.

Your mind is getting more and more cluttered with all these cases and how to weigh them against each other; you no longer have a fresh perspective to think "do I really want this?"

One solution is to give yourself strict rules at the start (X dollar range, Y location range, Z square foot range) and absolutely stick to them, because as you get more and more confused down the line you will be tempted to violate those ranges and you might make a mistake simply because your decision making capacity has got shot.

Value

How do you get good value from a large purchase? (for most people this is just a car or a house)

Well some people can do it by smooth talking or manipulating the seller. I'm never going to succeed at that so let's talk about other ways.

One is to find a desperate seller. Sometimes you just get lucky with your timing and find a seller who needs to move right now and will take a low price. Basically you just go around to lots of sellers and give them low-ball offers, and eventually someone will bite. You can't be too picky about what you get this way, though, because the chance of finding a desperate seller in the house you want is very low. (it's easier with cars than houses since there are lots of identical cars of a given model).

Probably the best way, though, is to find properties where the market's valuation doesn't match your own valuation. Of course that sucks for resale value, but if you just want to maximize the value to you, your best bet is to look for ways that your personal valuation mismatches the market. A few for me :

I hate finished basements, but they count as square footage, so houses with finished basements
are massively overvalued for me.

I super-value some privacy in the yard.

I don't care about square footage of the house that much in particular, so large houses are over-valued
and small houses are under-valued.  A lot of people look at $/sq-ft way too much.

Blah blah blah I'm bored of this topic.

6/17/2011

06-17-11 - C casting is the devil

C-style casting is super dangerous, as you know, but how can you do better?

There are various situations where I need to cast just to make the compiler happy that aren't actually operational casts. That is, if I was writing ASM there would be no cast there. For example something like :

U16 * p;
p[0] = 1;
U8 * p2 = (U8 *)p;
p2[1] = 7;
is a cast that changes the behavior of the pointer (eg. "operational"). But, something like :
U16 * p;
*p = 1;
U8 * p2 = (U8 *)p;
p2 += step;
p = (U16 *) p2;
*p = 2;
is not really a functional cast, but I have to do it because I want to increment the pointer by some step in bytes, and there's no way to express that in C without a cast.

Any time I see a C-style cast in code I think "that's a bug waiting to happen" and I want to avoid it. So let's look at some ways to do that.

1. Well, since we did this as an example already, we can hide those casts with something like ByteStepPointer :


template<typename T>
T * ByteStepPointer(T * ptr, ptrdiff_t step)
{
    return (T *)( ((intptr_t)ptr) + step );
}

our goal here is to hide the nasty dangerous casts from the code we write every day, and bundle it into little utility functions where it's clear what the purpose of the cast is. So now we can write our example as :
U16 * p;
*p = 1;
p = ByteStepPointer(p,step);
*p = 2;
which is much prettier and also much safer.

2. The fact that "void *" in C++ doesn't cast to arbitrary pointers the way it does in C is really fucking annoying. It means there is no "generic memory location" type. I've been experimenting with making the casts in and out of void explicit :


template<typename T>
T * CastVoid(void * ptr)
{
    return (T *)( ptr );
}

template<typename T>
void * VoidCast(T * ptr)
{
    return (void *)( ptr );
}

but it sucks that it's so verbose. In C++0x you can do this neater because you can template specialize based on the left-hand-side. So in current C++ you have to write
Actor * a = CastVoid<Actor>( memory );
but in 0x you will be able to write just
Actor * a = CastVoid( memory );
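One way to get the shorter spelling, which as far as I know already works in current C++, is to return a small proxy object with a templated conversion operator, so the left-hand side picks the type. This is a sketch; VoidCaster is a hypothetical name, and Actor here is just a dummy struct.

```cpp
#include <cassert>

// Hypothetical proxy: remembers the void pointer and converts to whatever
// T * the left-hand side asks for. Converting to a non-pointer type
// (eg. an int) simply does not compile.
struct VoidCaster
{
    void * ptr;

    explicit VoidCaster(void * p) : ptr(p) { }

    template <typename T>
    operator T * () const { return static_cast<T *>(ptr); }
};

static inline VoidCaster CastVoid(void * ptr)
{
    return VoidCaster(ptr);
}

// Dummy type for the usage example.
struct Actor { int health; };
```

With that in place "Actor * a = CastVoid( memory );" compiles, and it still can't silently turn the pointer into an integer the way a C-style cast can.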

There are a few cases where you need this, one is to call basic utils like malloc or memset - it's not useful to make the cast clear in this case because the fact that I'm calling memset is clear enough that I'm treating this pointer as untyped memory; another is if you have some generic "void *" payload in a node or message.

Again you don't want just a plain C-style cast here, for example something like :

Actor * a = (Actor *) node->data;
is a bug waiting to happen if you change "data" to an int (among other things).

3. A common annoying case is having to cast signed/unsigned. It should be obvious that when I write :

U32 set = blah;
U32 mask = set & (-set);
that I want the "-" operator to act as (~set + 1) on the bits and I don't care that it's unsigned, but C won't let you do that. (see previous rants about how what I really want in this scenario is a "#pragma requires(twos_complement)" ; warning me about the sign is fucking useless for portability because it just makes me cast, if you want to make a real portable language you have to be able to express capabilities of the platform and constraints of the algorithm).

So, usually what you want is a cast that gives you the signed type of the same register size, and that doesn't exist. So I made my own :


static inline S8  Signed(U8 x)  { return (S8) x; }
static inline S16 Signed(U16 x) { return (S16) x; }
static inline S32 Signed(U32 x) { return (S32) x; }
static inline S64 Signed(U64 x) { return (S64) x; }

static inline U8  Unsigned(S8 x)  { return (U8) x; }
static inline U16 Unsigned(S16 x) { return (U16) x; }
static inline U32 Unsigned(S32 x) { return (U32) x; }
static inline U64 Unsigned(S64 x) { return (U64) x; }

So for example, this code :
mask = set & (-(S32)set);
is a bug waiting to happen if you switch to 64-bit sets. But this :
mask = set & (-Signed(set));
is robust. (well, robust if you include a compiler assert that you're 2's complement)
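For this particular bit trick there's also a way to skip the signed cast entirely: spell out the (~set + 1) directly. Unsigned arithmetic wraps modulo 2^N by definition, so this needs no two's complement assumption, and it also avoids negating the most negative signed value (which is signed overflow). This is an alternative sketch, not the Signed() helper above; LowestSetBit is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Isolate the lowest set bit of an unsigned value of any width.
// (~set + 1) is the unsigned spelling of "negate"; it wraps, it does not overflow.
template <typename T>
T LowestSetBit(T set)
{
    static_assert(std::is_unsigned<T>::value, "LowestSetBit needs an unsigned type");
    // the casts back to T matter for 8 and 16 bit types, which promote to int
    return (T)( set & (T)(~set + 1) );
}
```

Because the width comes from the template argument, switching the set from 32 to 64 bits doesn't leave a stale (S32) behind.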

4. Probably the most common case is where you "know" a value is small and need to put it in a smaller type. eg.

int x = 7;
U8 small = (U8) x;
But all integer-size-change casts are super unsafe, because you can later change the code such that x doesn't fit in "small" anymore.

(often you were just wrong or lazy about "knowing" that the value fit in the smaller type. One of the most common cases for this right now is putting file sizes and memory sizes into 32-bit ints. Lots of people get annoying compiler warnings about that and think "oh, I know this is less than 2 GB so I'll just C-style cast". Oh no, that is a huge maintenance nightmare. In two years you try to run on a larger file and suddenly you have bugs all over and you can't find them because you used C-style casts. Start checking your casts!).

You can do this with a template thusly :


// check_cast just does a static_cast and makes sure you didn't wreck the value
template <typename t_to, typename t_fm>
t_to check_cast( const t_fm & from )
{
    t_to to = static_cast<t_to>(from);
    ASSERT( static_cast<t_fm>(to) == from );
    return to;
}

but it is so common that I find the template a bit excessively verbose (again C++0x with LHS specialization would help, you could then write just :
small = check( x );

small = clamp( x );
which is much nicer).

To do clamp casts with a template is difficult. You can use std::numeric_limits to get the ranges of the dest type :

template <typename t_to, typename t_fm>
t_to clamp_cast( const t_fm & from )
{
    t_to lo = std::numeric_limits<t_to>::min();
    t_to hi = std::numeric_limits<t_to>::max();
    if ( from < lo ) return lo; // !
    if ( from > hi ) return hi; // !
    t_to to = static_cast<t_to>(from);
    ASSERT( static_cast<t_fm>(to) == from );
    return to;
}
however, the compares inherent (at !) in clamping are problematic, for example if you're trying to clamp_cast from signed to unsigned you may get warnings there (you can also get the unsigned compare against zero warning when lo is 0). (? is there a nice solution to this ? you want to cast to the larger range of the two types for the purpose of the compare, so you could make some template helpers that do the compare in the wider of the two types, but that seems a right mess).
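For what it's worth, here's a sketch of what those compare helpers might look like for integer types. It handles the sign separately and then compares magnitudes in the widest unsigned type, so it never does a mixed signed/unsigned compare. is_negative and int_less are hypothetical helper names, and this is integers only (no floats).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Sign test that never compares an unsigned value against zero.
template <typename T> bool is_negative_impl(T v, std::true_type)  { return v < 0; }
template <typename T> bool is_negative_impl(T,   std::false_type) { return false; }
template <typename T> bool is_negative(T v)
{
    return is_negative_impl(v, typename std::is_signed<T>::type());
}

// a < b for any mix of signed/unsigned integer types.
template <typename A, typename B>
bool int_less(A a, B b)
{
    if (is_negative(a))
        return is_negative(b) ? ((std::intmax_t)a < (std::intmax_t)b) : true;
    if (is_negative(b))
        return false;
    // both non-negative : safe to compare in the widest unsigned type
    return (std::uintmax_t)a < (std::uintmax_t)b;
}

template <typename t_to, typename t_fm>
t_to clamp_cast(t_fm from)
{
    static_assert(std::is_integral<t_to>::value && std::is_integral<t_fm>::value,
                  "integer types only in this sketch");
    const t_to lo = std::numeric_limits<t_to>::min();
    const t_to hi = std::numeric_limits<t_to>::max();
    if (int_less(from, lo)) return lo;
    if (int_less(hi, from)) return hi;
    return static_cast<t_to>(from);
}
```

So clamp_cast<uint8_t>(-5) gives 0 and clamp_cast<uint8_t>(300) gives 255, with no warnings to cast away. It is still a fair amount of machinery for what it does.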

Rather than try to fix all that I just use non-template versions for our basic types :


static inline U8 S32ToU8Clamp(S32 i)    { return (U8) CLAMP(i,0,0xFF); }
static inline U8 S32ToU8Check(S32 i)    { ASSERT( i == (S32)S32ToU8Clamp(i) ); return (U8)i; }

static inline U16 S32ToU16Clamp(S32 i)  { return (U16) CLAMP(i,0,0xFFFF); }
static inline U16 S32ToU16Check(S32 i)  { ASSERT( i == (S32)S32ToU16Clamp(i) ); return (U16)i; }

static inline U32 S64ToU32Clamp(S64 i)  { return (U32) CLAMP(i,0,0xFFFFFFFFUL); }
static inline U32 S64ToU32Check(S64 i)  { ASSERT( i == (S64)S64ToU32Clamp(i) ); return (U32)i; }

static inline U8 U32ToU8Clamp(U32 i)    { return (U8) CLAMP(i,0,0xFF); }
static inline U8 U32ToU8Check(U32 i)    { ASSERT( i == (U32)U32ToU8Clamp(i) ); return (U8)i; }

static inline U16 U32ToU16Clamp(U32 i)  { return (U16) CLAMP(i,0,0xFFFF); }
static inline U16 U32ToU16Check(U32 i)  { ASSERT( i == (U32)U32ToU16Clamp(i) ); return (U16)i; }

static inline U32 U64ToU32Clamp(U64 i)  { return (U32) CLAMP(i,0,0xFFFFFFFFUL); }
static inline U32 U64ToU32Check(U64 i)  { ASSERT( i == (U64)U64ToU32Clamp(i) ); return (U32)i; }

static inline S32 U64ToS32Check(U64 i)  { S32 ret = (S32)i; ASSERT( ret >= 0 && (U64)ret == i ); return ret; } // ret >= 0 : a negative ret sign-extends and could match a huge i
static inline S32 S64ToS32Check(S64 i)  { S32 ret = (S32)i; ASSERT( (S64)ret == i ); return ret; }

which is sort of marginally okay. Maybe it would be nicer if I left off the type it was casting from in the name.

6/16/2011

06-16-11 - Optimal Halve for Doubling Filter

I've touched on this topic several times in the past. I'm going to wrap up a loose end.

Say you have some given linear doubling filter (linear in the operator sense, not that it's a line). You wish to halve your image in the best way such that the round trip has minimum error.

For a given discrete doubling filter (non-interpolating) find the optimal halving filter that minimizes L2 error. I did it numerically, not analytically, and measured the actual error of down->up vs. original on a large test set.

I generated halving filters for half-widths of 3,4, and 5. Larger filters always produce lower error, but also more ringing, so you may not want the largest width halving filter.
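To be concrete about what "error of down->up vs. original" means, here is a 1D sketch of the measurement. The conventions are my assumptions for this sketch, not necessarily what the test harness did: edges are clamped, the even-width halving filter is centered between samples 2i and 2i+1, and the "linear" up filter taps are scaled by 2 so each output phase sums to 1 (ie. 0.75/0.25 weights). The 6-tap filter is the first downFilter listed under "linear" below; the function names are made up.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// First halving filter listed for the "linear" upfilter.
static const float c_down6_for_linear[6] =
    { -0.15431f, 0.00162f, 0.65269f, 0.65269f, 0.00162f, -0.15431f };

// Read a sample with clamped edges.
static double Sample(const std::vector<double> & v, int i)
{
    if (i < 0) i = 0;
    if (i >= (int)v.size()) i = (int)v.size() - 1;
    return v[i];
}

// Halve with an even-width symmetric filter centered between 2i and 2i+1.
static std::vector<double> Halve(const std::vector<double> & in, const float * filter, int width)
{
    std::vector<double> out(in.size() / 2);
    for (int i = 0; i < (int)out.size(); ++i)
    {
        double sum = 0;
        for (int k = 0; k < width; ++k)
            sum += filter[k] * Sample(in, 2*i + k - (width/2 - 1));
        out[i] = sum;
    }
    return out;
}

// Double with the linear up filter ( {1,3,3,1}/8 taps, times 2 ).
static std::vector<double> DoubleLinear(const std::vector<double> & in)
{
    std::vector<double> out(in.size() * 2);
    for (int i = 0; i < (int)in.size(); ++i)
    {
        out[2*i]     = 0.75 * Sample(in, i) + 0.25 * Sample(in, i - 1);
        out[2*i + 1] = 0.75 * Sample(in, i) + 0.25 * Sample(in, i + 1);
    }
    return out;
}

// Sum of squared differences (the L2 error being minimized).
static double SquaredError(const std::vector<double> & a, const std::vector<double> & b)
{
    assert(a.size() == b.size());
    double err = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        err += (a[i] - b[i]) * (a[i] - b[i]);
    return err;
}
```

The fit then amounts to adjusting the halving taps to minimize SquaredError( original, DoubleLinear( Halve(original) ) ) summed over the test set.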


upfilter :  linear  :
const float c_filter[4] = { 0.12500, 0.37500, 0.37500, 0.12500 };

 downFilter : 
const float c_filter[6] = { -0.15431, 0.00162, 0.65269, 0.65269, 0.00162, -0.15431 };
fit err = 17549.328

 downFilter : 
const float c_filter[8] = { 0.05429, -0.21038, -0.01115, 0.66724, 0.66724, -0.01115, -0.21038, 0.05429 };
fit err = 17238.310

 downFilter : 
const float c_filter[10] = { 0.05159, 0.00138, -0.21656, -0.00044, 0.66402, 0.66402, -0.00044, -0.21656, 0.00138, 0.05159 };
fit err = 16959.596

upfilter :  mitchell1  :
const float c_filter[8] = { -0.00738, -0.01172, 0.12804, 0.39106, 0.39106, 0.12804, -0.01172, -0.00738 };

 downFilter : 
const float c_filter[6] = { -0.13475, 0.02119, 0.61356, 0.61356, 0.02119, -0.13475 };
fit err = 17496.548

 downFilter : 
const float c_filter[8] = { 0.05595, -0.19268, 0.00985, 0.62688, 0.62688, 0.00985, -0.19268, 0.05595 };
fit err = 17131.069

 downFilter : 
const float c_filter[10] = { 0.05239, 0.00209, -0.19664, 0.01838, 0.62379, 0.62379, 0.01838, -0.19664, 0.00209, 0.05239 };
fit err = 16811.168

upfilter :  lanczos4  :
const float c_filter[8] = { -0.00886, -0.04194, 0.11650, 0.43430, 0.43430, 0.11650, -0.04194, -0.00886 };

 downFilter : 
const float c_filter[6] = { -0.09637, 0.05186, 0.54451, 0.54451, 0.05186, -0.09637 };
fit err = 17332.452

 downFilter : 
const float c_filter[8] = { 0.04290, -0.14122, 0.04980, 0.54852, 0.54852, 0.04980, -0.14122, 0.04290 };
fit err = 17054.006

 downFilter : 
const float c_filter[10] = { 0.03596, 0.00584, -0.13995, 0.05130, 0.54685, 0.54685, 0.05130, -0.13995, 0.00584, 0.03596 };
fit err = 16863.054

upfilter :  lanczos5  :
const float c_filter[10] = { 0.00551, -0.02384, -0.05777, 0.12982, 0.44628, 0.44628, 0.12982, -0.05777, -0.02384, 0.00551 };

 downFilter : 
const float c_filter[6] = { -0.08614, 0.07057, 0.51557, 0.51557, 0.07057, -0.08614 };
fit err = 17323.692

 downFilter : 
const float c_filter[8] = { 0.05112, -0.13959, 0.06782, 0.52065, 0.52065, 0.06782, -0.13959, 0.05112 };
fit err = 16899.712

 downFilter : 
const float c_filter[10] = { 0.04554, 0.00403, -0.13655, 0.06840, 0.51857, 0.51857, 0.06840, -0.13655, 0.00403, 0.04554 };
fit err = 16566.352

------------------------------

6/14/2011

06-14-11 - ProcessSuicide

The god damn lagarith DLL has some crash in its shutdown, so any time I play an AVI with an app that uses lagarith, it hangs on exit.

(this is one of the reasons that I need to write my own lossless video format; the other reason is that lagarith can't play back at 30 fps even on ridiculously fast modern machines; and the other standard HuffYUV frequently crashes for me and is very hard to make support RGB correctly)

Anyhoo, I started using this to shut down my app, which doesn't have the stupid "wait forever for hung DLL's to unload" problem :


void ProcessSuicide()
{
    DWORD myPID = GetCurrentProcessId();

    lprintf("ProcessSuicide PID : %d\n",myPID);

    HANDLE hProcess = OpenProcess (PROCESS_ALL_ACCESS, FALSE, myPID); 
        
    if ( hProcess == NULL ) // OpenProcess returns NULL on failure, not INVALID_HANDLE_VALUE
    {
        lprintf("Couldn't open my own process!\n");
        // ?? should probably do something else here, but never happens
        return;
    }
        
    TerminateProcess(hProcess,0);
    
    // ... ?? ... should not get here
    ASSERT(false);
    
    CloseHandle (hProcess);
}

At first I thought this was a horrible hack, but I've been using it for months now and it doesn't cause any problems, so I'm sort of tempted to call it not a hack but rather just a nice way to quit your app in Windows and not ever get that stupid thing where an app hangs in shutdown (which is a common problem for big apps like MSDev and Firefox).

06-14-11 - How to do input for video games

1. Read all input in one spot. Do not scatter input reading all over the game. Read it into global state which then applies for the time slice of the current frame. The rest of the game code can then ask "is this key down" or "was this pressed" and it just checks the cached state, not the hardware.

2. Respond to input immediately. Generally what that means is you should have a linear sequence of events that is something like this :

Poll input
Do actions triggered by input (eg. fire bullets)
Do time evolution of player-action objects (eg. move bullets)
Do environment responses (eg. did bullets hit monsters?)
Render frame
(* see later)

3. On a PC you have to deal with the issue of losing focus, or pausing and resuming. This is pretty easy to get correct if you obeyed #1 - read all your input in one spot, it just zeros the input state while you are out of focus. The best way to resume is when you regain focus you immediately query all your input channels to wipe any "new key down" flags, but just discard all the results. I find a lot of badly written apps that either lose the first real key press, or incorrectly respond to previous app's keys when they didn't have focus.

( For example I have keys like ctrl-alt-q that toggle focus around for me, and badly written apps will respond to that "q" as if it were for them, because they just ask for the global "new key down" state and they see a Q that wasn't there the last time they checked. )
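A sketch of #1 and #3 together: one poll per frame into cached state, with the focus handling done in that same spot. The raw `hardware` array is a placeholder for whatever input API you actually poll, and InputState / InputInit / InputPoll are hypothetical names.

```cpp
#include <cassert>
#include <cstring>

enum { kNumKeys = 256 };

// Hypothetical cached input state; the rest of the game only ever reads this.
struct InputState
{
    bool down[kNumKeys];     // key is held this frame
    bool pressed[kNumKeys];  // key went down this frame ("new key down")
    bool hadFocus;           // did we have focus at the previous poll?
};

static void InputInit(InputState & s)
{
    std::memset(&s, 0, sizeof(s));
}

// Call exactly once per frame, before any game code looks at input.
static void InputPoll(InputState & s, const bool * hardware, bool hasFocus)
{
    for (int k = 0; k < kNumKeys; ++k)
    {
        if (!hasFocus)
        {
            // out of focus : report nothing at all
            s.down[k] = false;
            s.pressed[k] = false;
        }
        else if (!s.hadFocus)
        {
            // first poll after (re)gaining focus : sync the held state,
            // but discard any "new key down" that belonged to another app
            s.down[k] = hardware[k];
            s.pressed[k] = false;
        }
        else
        {
            s.pressed[k] = hardware[k] && !s.down[k];
            s.down[k] = hardware[k];
        }
    }
    s.hadFocus = hasFocus;
}
```

With this, the "q" from a ctrl-alt-q focus switch never shows up as a press, and the first real key press after resuming is not lost.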

4. Use a remapping/abstraction layer. Don't put actual physical button/keys all around your app. Even if you are sure that you don't want to provide remapping, do it anyway, because it's useful for you as a developer. That is, in your player shooting code don't write

  if ( NewButtonDown(X) ) ...
instead write
  if ( NewButtonDown(BUTTON_SHOOT) ) ...
and have a layer that remaps BUTTON_SHOOT to a physical key. The remap can also do things like taps vs holds, combos, sequences, etc. so all that is hidden from the higher level and you are free to easily change it at a later date.

This is obvious for real games, but it's true even for test apps, because you can use the remapping layer to log your key operations and provide help and such.
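The simplest possible version of the remap layer is just a table lookup. This sketch only does one-key-per-button (no taps vs holds or combos); ButtonMap is a hypothetical name, the key codes are placeholders, and `pressed` is assumed to be the cached per-frame "new key down" array from the input poll.

```cpp
#include <cassert>

// Logical buttons that game code is allowed to talk about.
enum LogicalButton { BUTTON_SHOOT, BUTTON_JUMP, BUTTON_COUNT };

// Hypothetical remap table : logical button -> physical key code.
struct ButtonMap
{
    int physicalKey[BUTTON_COUNT];
};

// Game code asks about the logical button; only this layer knows the physical key.
static bool NewButtonDown(const ButtonMap & map, const bool * pressed, LogicalButton b)
{
    return pressed[ map.physicalKey[b] ];
}
```

Rebinding is then one write to the table, and this function is the natural place to hang logging or a help screen.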

(*) extra note on frame order processing.

I believe there are two okay frame sequences and I'm not sure there's a strong argument in one way or the other :


Method 1 :

Time evolve all non-player game objects
Prepare draw buffers for non-player game objects
Get input
Player responds to input
Player-actions interact with world
Prepare draw buffers for player & stuff just made
Kick render buffers

Method 2 :

Get input
Player responds to input
Player-actions interact with world
Time evolve all non-player game objects
Prepare draw buffers for player & stuff just made
Prepare draw buffers for non-player game objects
Kick render buffers

The advantage of Method 1 is that the time between "get input" and "kick render" is absolutely minimized (it's reduced by the amount of time that it takes you to process the non-player world), so if you press a button that makes an explosion, you see it as soon as possible. The disadvantage is that the monsters you are shooting have moved before you do input. But, there's actually a bunch of latency between "kick render" and getting to your eye anyway, so the monsters are *always* ahead of where you think they are, so I think Method 1 is preferable. Another disadvantage of Method 1 is that the monsters essentially "get the jump on you" eg. if they are swinging a club at you, they get to do that before your "block" button reaction is processed. This could be fixed by doing something like :

Method 3 :

Time evolve all non-player game objects (except interactions with player)
Prepare draw buffers for non-player game objects
Get input
Player responds to input
Player-actions interact with world
Non-player objects interact with player
Prepare draw buffers for player & stuff just made
Kick render buffers

this is very intentionally not "fair" between the player and the rest of the world, we want the player to basically win the initiative roll all the time.

Some game devs have this silly idea that all the physics needs to be time-evolved in one atomic step which is absurd. You can of course time evolve all the non-player stuff first to get that done with, and then evolve the player next.

06-14-11 - A simple allocator

You want to be able to allocate slots, free slots, and iterate on the allocated slot indexes. In particular :


int AllocateSlot( allocator );
void FreeSlot( allocator , int slot );
int GetNextSlot( iterator );

Say you can limit the maximum number of allocations to 32 or 64, then obviously you should use bit operations. But you also want to avoid variable shifts. What do you do ?

Something like this :


static int BottomBitIndex( U32 val )
{
    ASSERT( val != 0 );
    #ifdef _MSC_VER
    unsigned long b = 0;
    _BitScanForward( &b, val );
    return (int)b;
    #elif defined(__GNUC__)
    return __builtin_ctz(val); // ctz , not clz
    #else
    #error need bottom bit index
    #endif
}

int __forceinline AllocateSlot( U32 & set )
{
    U32 inverse = ~set;
    ASSERT( inverse != 0 ); // no slots free!
    int index = BottomBitIndex(inverse);
    U32 mask = inverse & (-inverse);
    ASSERT( mask == (1UL<<index) );
    set |= mask;
    return index;
}

void __forceinline FreeSlot( U32 & set, int slot )
{
    ASSERT( set & (1UL<<slot) );
    set ^= (1UL<<slot);
}

int __forceinline GetNextSlot( U32 & set )
{
    ASSERT( set != 0 );
    int slot = BottomBitIndex(set);
    // turn off bottom bit :
    set = set & (set-1);
    return slot;
}

/*

// example iteration :

    U32 cur = set;
    while(cur)
    {
        int i = GetNextSlot(cur);
        lprintfvar(i);
    }

*/

However, this uses the bottom bit index, which is not as portably fast as using the top bit index (aka count leading zeros). (there are some platforms/gcc versions where builtin_ctz does not exist at all, and others where it exists but is not fast because there's no direct instruction set correspondence).

So, the straightforward version that uses shifts and clz is probably better in practice.

ADDENDUM : Duh, version of same using only TopBitIndex and no variable shifts :


U32 __forceinline AllocateSlotMask( U32 & set )
{
    ASSERT( (set+1) != 0 ); // no slots free!
    U32 mask = (~set) & (set+1); // find lowest off bit
    set |= mask;
    return mask;
}

void __forceinline FreeSlotMask( U32 & set, U32 mask )
{
    ASSERT( set & mask );
    set ^= mask;
}

U32 __forceinline GetNextSlotMask( U32 & set )
{
    ASSERT( set != 0 ); // iteration over when set == 0
    U32 mask = set & (-set); // lowest on bit
    set ^= mask;
    return mask;
}

int __forceinline MaskToSlot( U32 mask )
{
    int slot = TopBitIndex(mask);
    ASSERT( mask == (1UL<<slot) );
    return slot;
}

(note the forceinline is important because the use of actual references is catastrophic on many platforms (due to LHS, ie. load-hit-store stalls), we need these to get compiled like macros).
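The mask version leaves TopBitIndex undefined, so here is a self-contained sketch you can actually compile. The looping TopBitIndex is just a portable stand-in that I'm assuming for illustration (real code would map it to a clz instruction), ListSlots is a made-up helper, and plain `static inline` replaces __forceinline to keep it portable :

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t U32;

// Stand-in for TopBitIndex : index of the highest set bit.
// A real build would use count-leading-zeros instead of looping.
static inline int TopBitIndex(U32 val)
{
    assert(val != 0);
    int index = 0;
    while (val >>= 1) index++;
    return index;
}

static inline U32 AllocateSlotMask(U32 & set)
{
    assert(set != 0xFFFFFFFFu);      // no slots free!
    U32 mask = (~set) & (set + 1);   // lowest off bit
    set |= mask;
    return mask;
}

static inline void FreeSlotMask(U32 & set, U32 mask)
{
    assert(set & mask);
    set ^= mask;
}

static inline U32 GetNextSlotMask(U32 & set)
{
    assert(set != 0);                // iteration is over when set == 0
    U32 mask = set & (0u - set);     // lowest on bit
    set ^= mask;
    return mask;
}

static inline int MaskToSlot(U32 mask)
{
    int slot = TopBitIndex(mask);
    assert(mask == (U32(1) << slot));
    return slot;
}

// Hypothetical helper : write the allocated slot indexes (lowest first)
// into 'slots' and return how many there are.
static int ListSlots(U32 set, int slots[32])
{
    int count = 0;
    while (set) slots[count++] = MaskToSlot(GetNextSlotMask(set));
    return count;
}
```

Because each mask has exactly one bit on, the top bit index and the bottom bit index are the same thing, which is why TopBitIndex is enough here.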

6/11/2011

06-11-11 - God damn YUV

So I've noticed for the last year or so that x264 videos I was making as test/reference all had weirdly shifted brightness values. I couldn't figure out why exactly and forgot about it.

Yesterday I finally adapted my Y4M converter (which does AVI <-> Yuv4MPEG with RGB <-> YUV color conversion and up/down sample, and uses various good methods of YUV, such as out of gamut chroma spill, lsqr optimized conversion, etc.). I added support for the "rec601" (JPEG) and "bt709" (HDTV) versions of YUV (and by "YUV" I mean YCbCr in gamma-encoded space), with both 0-255 and 16-235 range support. I figured I would stress test it by trying to use it in place of ffmpeg in my h264 pipeline for the Y4M conversion. And I found the old brightness problem.

It turns out that when I make an x264 encode and then play it back through DirectShow (with ffdshow), the player is using the "BT 709" yuv matrix (in 16-235 range) (*). When I use MPlayer to play it back and write out frames, it's using the "rec 601" yuv matrix (in 16-235 range).

(*
this appears to be because there's nothing specified in the stream and ffdshow will pick the matrix based on the resolution of the video - so that will super fuck you, depending on the size of the video you need to pick a different matrix (it's trying to do the right thing for HDTV vs SDTV standard video). Their heuristic is :

width > 1024 or height >= 600: BT.709
width <=1024 and height < 600: BT.601
*)

(in theory x264 doesn't do anything to the YUV planes - I provide it y4m, and it just works on yuv as bytes that it doesn't know anything about; the problem is the decoders which are doing their own thing).

The way I'm doing it now is I make the Y4M myself in rec601 space, let x264 encode it, then extract frames with mplayer (which seems to always use 601 regardless of resolution). If there was a way to get the Y4M directly out of x264 that would make it much easier because I could just do my own yuv->rgb (the only way I've found to do this is to use ffmpeg raw output).

Unfortunately Y4M itself doesn't seem to have any standardized tag to indicate what kind of yuv data is in the container. I've made up my own ; I write an xtag that contains :


yuv=pc.601
yuv=pc.709
yuv=bt.601
yuv=bt.709

where "bt" implies 16-235 luma (16-240 chroma) and "pc" implies 0-255 (fullrange).
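To make the four variants concrete, here is a small sketch of what each tag means numerically. The names (YuvSpec, ParseYuvTag, RgbToYCbCr, GuessIs709) are made up for this example and are not from my converter; the luma weights are the standard 601 (Kr=0.299, Kb=0.114) and 709 (Kr=0.2126, Kb=0.0722) values, and I'm assuming the usual JPEG-style convention that "pc" chroma spans 255 steps around 128. GuessIs709 is just the ffdshow resolution heuristic quoted earlier :

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Luma weights and range for one flavor of YCbCr.
struct YuvSpec
{
    double kr, kb;    // Kg = 1 - Kr - Kb
    bool fullRange;   // true : 0-255 ("pc") , false : 16-235 luma / 16-240 chroma ("bt")
};

// Parse one of the xtag values listed above, e.g. "yuv=bt.709".
static bool ParseYuvTag(const std::string & tag, YuvSpec & spec)
{
    if (tag.size() != 10 || tag.compare(0, 4, "yuv=") != 0 || tag[6] != '.') return false;
    std::string range = tag.substr(4, 2), matrix = tag.substr(7, 3);
    if (range != "pc" && range != "bt") return false;
    if (matrix != "601" && matrix != "709") return false;
    spec.fullRange = (range == "pc");
    spec.kr = (matrix == "601") ? 0.299 : 0.2126;
    spec.kb = (matrix == "601") ? 0.114 : 0.0722;
    return true;
}

// Gamma-encoded RGB in [0,1] -> Y, Cb, Cr on the 8-bit scale (left unrounded).
static void RgbToYCbCr(const YuvSpec & s, double r, double g, double b,
                       double & y, double & cb, double & cr)
{
    double luma = s.kr * r + (1.0 - s.kr - s.kb) * g + s.kb * b;
    double pb = 0.5 * (b - luma) / (1.0 - s.kb);   // in [-0.5, 0.5]
    double pr = 0.5 * (r - luma) / (1.0 - s.kr);   // in [-0.5, 0.5]
    double yScale = s.fullRange ? 255.0 : 219.0;
    double cScale = s.fullRange ? 255.0 : 224.0;
    y  = (s.fullRange ? 0.0 : 16.0) + yScale * luma;
    cb = 128.0 + cScale * pb;
    cr = 128.0 + cScale * pr;
}

// The ffdshow guess, used when the stream carries no matrix tag.
static bool GuessIs709(int width, int height)
{
    return width > 1024 || height >= 600;
}
```

For pure green the 16-235 luma comes out around 144.6 with the 601 weights and around 172.6 with the 709 weights, so an encoder and decoder that disagree about the matrix are off by a very visible amount.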

x264 has a bunch of --colormatrix options to tag the color space in the H264 stream, but apparently many players don't respect it, so the recommended practice is to use the color space that matches your resolution (eg. 709 for HD and 601 for SD). (the --colormatrix options you want are bt709 and bt470bg , I believe).

Some notes by other people :


TV capture "SD" mpeg2 720x576i -> same res in mpe4, so use --colormatrix bt601 --fullrange ?
TV capture "HD" mpeg2 1440x1080i -> same res in mpe4, so use --colormatrix bt709 --fullrange ?

look at table E-3 (Colour Primaries) in the H.264 spec:

bt470bg = bt601 625 = bt1358 625 = bt1700 625 (PAL/SECAM)
smpte170m = bt601 525 = bt1358 525 = bt1700 NTSC

(yes, PAL and NTSC have different bt601 matrices here)

yup there's only:
--colormatrix <string> Specify color matrix setting ["undef"]
- undef, bt709, fcc, bt470bg, smpte170m, smpte240m, GBR, YCgCo

ADDENDUM : god damn the color matrix change in bt.709 is so retarded. While in theory the phosphors of HDTVs match 709 better than 601, that is actually pretty irrelevant, since YCbCr is run in gamma-corrected space, and we do the chroma sub-sample, and so on ( see Mag of nonconst luminance error - Charles Poynton ). The actual practical effect of the 709 new matrix is that we're watching lots of videos with badly shifted brightness and saturation. In reality, it just made video quality much much worse.

(I also don't understand the 16-235 range that was used in MPEG. Yeah yeah, NTSC needs the top and bottom of the signal for special codes, fine, but why does that have to be hard-coded into the digital signal? The special region at top and bottom is an *analog* thing. The video could have been full range 0-255 in the digital encoding, and then in the DAC output you just squish it into the middle 7/8 of the signal band. Maybe there's something going on that I don't understand, but it just seems like terrible software engineering design to take the weird quirk of one system (NTSC analog output) and push that quirk back up the pipeline to affect something (digital encoding format) that it doesn't need to).

6/08/2011

06-08-11 - Tech Todos

Publicly getting my thoughts together :

1. Oodle. Just finish it! God damn it.

2. JPEG decoder. I got really close to having this done, need to finish it. The main thing left that I want to do is work on the edge-adaptive-bilateral filter a bit more; currently it's a bit too strong on the non-artifact areas, I think I can make it more selective about only working on the ringing and blockiness. The other things I want are chroma-from-luma support and a special mode for text/graphics.

3. Byte-wise LZ with optimal parse. This has been on my list for a long time. I'm not really super motivated though. But it goes along with -

4. LZ string matcher test. Hash tables, hash->list, hash->bintree, hash->MMC, suffix trees, suffix arrays, patricia tries, etc. ? Would be nice to make a thorough test bed for this. (would also help the Oodle LZ encoder which is currently a bit slow due to me not spending any time on the string matcher).

5. Cuckoo hash / cache aware hash ; I did a bunch of hash testing a while ago and want to add this to my tests. I'm very curious about it, but this is kind of pointless.

6. Image doubler / denoiser / etc ; mmm meh I've lost my motivation for this. It's a big project and I have too many other things to do.

7. Make an indy game. Sometimes I get the craving to do something interactive/artistic. I miss being able to play with my work. (I also get the craving to make a "demo" which would be fun and is rather less pain in the butt than making a full game). Anyhoo, probably not gonna happen, since there's just not enough time for this.

ADDENDUM : some I forgot :

8. Finish my video codec ; I still want to redo my back end coder which was really never intended for video; maybe support multiple sizes of blocks; try some more perceptual metrics for encoder decision making; various other ideas.

9. New lossy image codec ; I have an unfinished one that I did for RAD, but I think I should just scrap it and do the new one. I'm interested in directional DCT. Also simple highly asymmetric schemes, such as static classes that are encoder-optimized (instead of adaptive models; adaptive models are very bad for LHS). More generally, I have some ideas about trying to make a codec that is more explicitly perceptual, it might be terrible in rmse, but look better to the human eye; one part of that is using the imdiff metrics I trained earlier, another part is block classification (smooth,edge,detail) and special coders per class.

06-08-11 - Pots 4

Last bunch for a while since I'm taking the summer off.

Large round bowl; rim is oribe, but not just applied over base glaze, I left some space bare at the rim so it would get better grip. The dark pattern on the side is kind of interesting. I dug grooves by chattering with a cutting tool, then applied an iron slip to the outside of the pot, then after it dried sanded away the slip. The result was just iron left in the chatter holes. Then the pot is glazed with yellow salt, which reacts with iron by darkening. So the outside is smooth but shows the cut grooves as dark spots.

Vase with triangular opening; Oribe glaze, did something weird that I didn't do on purpose at all, not sure how this happened, and I love that unpredictability :

Cylindrical vase ; Shino top oribe bottom ; top is dipped, then I poured on some extra layers of shino that create the white patterns where it's thick. The middle band is cobalt stain which I waxed over before glazing to create the clean boundaries; cobalt without a stain over it turns black in firing, if you put a clear over it, it would be brilliant blue.

Attempt at more geometric forms; meh

Medium size bowl; very round but a bit heavy. There's a ring of unglazed cobalt around the rim on this too; that was a mistake, I should have put clear over it to bring out the blue, you can only see a tiny band of what the blue would have been like. Oribe rim and a little pour of it into the bottom, just pour a tiny bit so you don't have to pour out excess.

Medium size bowl. Inside yellow salt, outside lung chuan. I did the outside by doing first a very thin watered down glaze layer (it's almost invisible but takes off the naked harshness of bare clay; I thinned a bit too much, just a tiny dash of water goes a long way). Then I dripped glaze while the pot spun to create very lumpy thick random application, sort of like sand castles. I think there's potential in this technique, will explore it further.

Small bowl; glaze base lung chuan, oribe splatter application by flicking a paint brush loaded with glaze. Not bad. Maybe stain splatter under clear glaze would be better.

6/07/2011

06-07-11 - How to read an LZ compressed file

An example of the kind of shite I'm doing in Oodle these days.

You have an LZ-compressed file on disk that you want to get decompressed into memory as fast as possible. How do you do this ?

Well, first of all, you make your compressor write in independent chunks so that the decompressor can run on multiple chunks at the same time with threads. But to start you need to know where the chunks are in the file, so the first step is :


1. Fire an async read of the first 64k of the file to get the header.

the header will tell you where all the independent chunks are. (aside : in final Oodle there may also be an option to agglomerate all the headers of all the files, so you may already have this first 64k in memory).

So after that async read is finished, you want to fire a bunch of decomps on the chunks, so the way to do this is :


2. Make a "Worklet" (async function callback) which parses the header ; set the Worklet to run when the IO op #1
finishes.

I used to do this by having the WorkMgr get a signal from IO thread (which still happens) but I now also have a mechanism to just run Worklets directly on the IO thread, which is preferable for Worklets that are super trivial like this one.

Now, if the file is small you could just have your Worklet #2 read the rest of the file and then fire async works on each one, but if the file is large that means you are waiting a long time for the IO before you start any decomp work, so that's not ideal, instead what we do is :


3. In Worklet #2, after parsing header, fire an async IO for each independent compressed chunk.  For each chunk, create
a decompression Worklet which is dependent on the IO of that chunk (and also neighbors, since due to IO
sector alignment the compression boundaries and IO boundaries are not quite the same).

So what this will do is start a bunch of IO's that then retire one by one, as each one retires it starts up the decomp task for that chunk. This means you start decompressing almost immediately and for large files you keep the CPU and IO busy the whole time.

Finally the main thread needs a way to wait for this all to be done. But the handles to the actual decompression async tasks don't exist until async task #2 runs, so the main thread can't wait on them directly. Instead :


4. At the time of initial firing (#1), create an abstract waitable handle and set it to "pending" state; then
pass this handle through your async chain.  Task #2 should set it to needing "N to go", since it's the first
point that knows the count, and then the actual async decompresses in #3 should decrement that counter.  So
the main thread can wait on it being "0 to go".

You can think of this as a semaphore, though in practice I don't use a semaphore because there are some OS's where that's not possible (sadly).
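A minimal sketch of that "N to go" handle, assuming C++11 std::mutex and std::condition_variable as a stand-in for whatever the platform really provides. CountdownHandle and its method names are made up for this example; this is not the Oodle implementation :

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

class CountdownHandle
{
public:
    // Task #2 calls this once the header is parsed and the chunk count is known.
    // It must happen before any of the chunk tasks can call Decrement.
    void SetCount(int n)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_toGo = n;
        if (m_toGo == 0) m_cv.notify_all();
    }

    // Each decompression task calls this when its chunk is done.
    void Decrement()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        assert(m_toGo > 0);
        if (--m_toGo == 0) m_cv.notify_all();
    }

    // True until the count has been set and has reached "0 to go".
    bool IsPending()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_toGo != 0;
    }

    // Main thread : sleep until "0 to go".
    void Block()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cv.wait(lock, [this] { return m_toGo == 0; });
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    int m_toGo = -1;   // -1 : pending, count not known yet
};
```

The handle is created at step #1 in the "pending" state (-1), so the main thread can wait on it even though none of the decompression tasks exist yet.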

What the client sees is just :


AsyncHandle h = OodleLZ_StartDecompress( fileName );

Async_IsPending(h); ?

Async_Block(h);

void * OodleLZ_GetFinishedDecompress( h );

if they just want to wait on the whole thing being done. But if you're going to parse the decompressed file, it's more efficient to only wait on the first chunk being decompressed, then parse that chunk, then wait on the next chunk, etc. So you need an alternate API that hands back a bunch of handles, and then a streaming File API that does the waiting for you.

6/04/2011

06-04-11 - Keep Case

I've been meaning to do this for a long time and finally got off my ass.

TR (text replace) and zren (rename) in ChukSH now support "keep case".

Keep case is pretty much what you always want when you do text replacement (especially in source code), and everybody should copy me. For example when I do a find-replace from "lzp1f" -> "lzp1g" what I want is :


lzp1f -> lzp1g  (lower->lower)
LZP1F -> LZP1G  (upper->upper)
Lzp1f -> Lzp1g  (first cap->first cap)
Lzp1F -> Lzp1G  (mixed -> mixed)

The kernel that does this is matchpat in cblib which will handle rename masks like : "poop*face" -> "shit*butt" with keep case option or not.

In a mixed-wild-literal renaming spec like that, the "keep case" applies only to the literal parts. That is, "poop -> shit" and "face -> butt" will be applied with keep-case independently , the "*" part will just get copied.

eg :


Poop3your3FACE -> Shit3your3BUTT

Also, because keep-case is applied to an entire chunk of literals, it can behave somewhat unexpectedly on file renames. For example if you rename

src\lzp* -> src\zzh*

the keep-case will apply to the whole chunk "src\lzp" , so if you have a file like "src\LZP" that will be considered "mixed case" not "all upper". Sometimes my intuition expects the rename to work on the file part, not the full path. (todo : add an option to separate the case-keeping units by path delims)

The way I handle "mixed case" is I leave it up to the user to provide the mixed case version they want. It's pretty impossible to get it right automatically. So the replacement text should be provided in the ideal mixed case capitalization. eg. to change "HelpSystem" to "QueryManager" you need to give me "QueryManager" as the target string, capitalized that way. All mixed case source occurrences of "HelpSystem" will be changed to the same output, eg.


helpsystem -> querymanager
HELPSYSTEM -> QUERYMANAGER
Helpsystem -> Querymanager
HelpSystem -> QueryManager
HelpsYstem -> QueryManager
heLpsYsTem -> QueryManager
HeLPSYsteM -> QueryManager

you get it.

The code is trivial of course, but here it is for your copy-pasting pleasure. I want this in my dev-studio find/replace-in-files please !


// strcpy "putString" to "into"
//  but change its case to match the case in src
// putString should be mixed case , the way you want it to be if src is mixed case
void strcpyKeepCase(
        char * into,
        const char * putString,
        const char * src,
        int srcLen);

void strcpyKeepCase(
        char * into,
        const char * putString,
        const char * src,
        int srcLen)
{   
    // okay, I have a match
    // what type of case is "src"
    //  all lower
    //  all upper
    //  first upper
    //  mixed
    
    int numLower = 0;
    int numUpper = 0;
    
    for(int i=0;i<srcLen;i++)
    {
        ASSERT( src[i] != 0 );
        // cast to unsigned char : passing a negative char to the ctype functions is undefined
        unsigned char c = (unsigned char) src[i];
        if ( isalpha(c) )
        {
            if ( isupper(c) ) numUpper++;
            else numLower++;
        }
    }
    
    // non-alpha :
    if ( numLower+numUpper == 0 )
    {
        strcpy(into,putString);
    }
    else if ( numLower == 0 )
    {
        // all upper :
        while( *putString )
        {
            *into++ = (char) toupper( (unsigned char) *putString ); putString++;
        }
        *into = 0;
    }
    else if ( numUpper == 0 )
    {
        // all lower :
        while( *putString )
        {
            *into++ = (char) tolower( (unsigned char) *putString ); putString++;
        }
        *into = 0;
    }
    else if ( numUpper == 1 && isalpha( (unsigned char) src[0] ) && isupper( (unsigned char) src[0] ) )
    {
        // first upper then low
        
        if( *putString ) //&& isalpha(*putString) )
        {
            *into++ = (char) toupper( (unsigned char) *putString ); putString++;
        }
        while( *putString )
        {
            *into++ = (char) tolower( (unsigned char) *putString ); putString++;
        }
        *into = 0;
    }
    else
    {
    
        // just copy putString - it should be mixed 
        strcpy(into,putString);
    }
}
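
If you'd rather not deal with raw buffers, here is a sketch of the same case classification on std::string, plus a tiny single-wildcard rename to show keep-case being applied to each literal chunk independently. KeepCase, EqualNoCase and RenameOneStar are hypothetical names made up for this example; this is not the matchpat code from cblib and it only handles patterns with exactly one "*" :

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Return 'put' re-cased to match the case style of 'src'
// (all lower, all upper, first cap ; anything else returns 'put' unchanged).
static std::string KeepCase(const std::string & put, const std::string & src)
{
    int numLower = 0, numUpper = 0;
    for (unsigned char c : src)
    {
        if (std::isupper(c)) numUpper++;
        else if (std::islower(c)) numLower++;
    }
    std::string out = put;
    if (numLower + numUpper == 0) return out;   // no letters in src
    bool firstCap = (numUpper == 1 && std::isupper((unsigned char)src[0]));
    if (numLower == 0 || numUpper == 0 || firstCap)
    {
        for (size_t i = 0; i < out.size(); i++)
        {
            unsigned char c = (unsigned char)out[i];
            bool upper = (numLower == 0) || (firstCap && i == 0);
            out[i] = (char)(upper ? std::toupper(c) : std::tolower(c));
        }
    }
    return out;   // mixed case : out is still 'put' exactly as given
}

static bool EqualNoCase(const std::string & a, const std::string & b)
{
    if (a.size() != b.size()) return false;
    for (size_t i = 0; i < a.size(); i++)
        if (std::tolower((unsigned char)a[i]) != std::tolower((unsigned char)b[i])) return false;
    return true;
}

// Apply the rename "pre*post" -> "newPre*newPost" to 'name'.
// The two literal chunks are re-cased independently ; the "*" part is copied.
static bool RenameOneStar(const std::string & name,
                          const std::string & pre, const std::string & post,
                          const std::string & newPre, const std::string & newPost,
                          std::string & out)
{
    if (name.size() < pre.size() + post.size()) return false;
    std::string head = name.substr(0, pre.size());
    std::string tail = name.substr(name.size() - post.size());
    if (!EqualNoCase(head, pre) || !EqualNoCase(tail, post)) return false;
    std::string mid = name.substr(pre.size(), name.size() - pre.size() - post.size());
    out = KeepCase(newPre, head) + mid + KeepCase(newPost, tail);
    return true;
}
```

The HelpSystem table and the Poop3your3FACE example above both fall out of this directly.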


ADDENDUM : on a roll with knocking off stuff I've been meaning to do for a while ...

ChukSH now also contains "fixhtmlpre.exe" which fixes any less-than signs that are found within a PRE chunk.

Hmm .. something lingering annoying going on here. Does blogger convert and-l-t into less thans?

ADDENDUM : yes it does. Oh my god the web is so fucked. I've been doing a bit of reading and it appears this is a common and atrocious hack. Basically the problem is that people use XML for the markup of the data transfer packets. Then they want to send XML within those packets. So you have to form some shit like :


<data> I want to send <B> this </B> god help me </data>

but putting the less-thans inside the data packet is illegal XML (it's supposed to be plain text), so instead they send

<data> I want to send &-l-tB> this &-l-t/B> god help me </data>

but they want the receiver to see a less-than, not the characters &-l-t , so the receiver parses those codes back into less-than and then treats the data received as its own hunk of XML with internal markups.

Basically people use it as a way to send codes that the current parser will ignore, but the next parser will see. There are lots of pages about how this is against compliance standards but nobody cares and it seems to be widespread.

So anyway, the conclusion is : just changing less thans to &-l-t works fine if you are just posting html (eg. for rants.html it works fine) but for sending to Blogger (or probably any other modern XML-based app) it doesn't.

The method I use now which seems to work on Blogger is I convert less thans to


<code><</code>

How is there not a fucking "literal" tag ? (There is one called XMP but it's deprecated and causes line breaks, and it's really not just a literal tag around a bunch of characters, it's a browser format mode change)

6/03/2011

06-03-11 - Recommend me a video game

I think I tried this post before, but let's try again :

No space marines. No WW2. How about just no marine/soldier theme in general. I'm not a big "violence in video games is bad" banger, I just think it's boring. And it's not my fantasy. I don't want to be a fucking soldier, shooting people is horrible, why do you want to pretend to do that?

Not gray or brown. Give me some damn color and beauty. I want to be excited to see the next level, and I want to be surprised and blown away and delighted when I do. There are endless possibilities for fantasy worlds, you can do better than warehouses.

Not primarily about combat ; especially not repetitive combat, like shoot these 20 guys, okay now shoot 20 more guys. Bleh, I'm bored.

Not "Don Bluth" semi-interactive , ala "Press A now" - you pressed A! good for you, monkey! (I'm looking at you, God of War). (actually I kind of liked those games when I was a kid even though it was uncool to like them; my favorite was "Cliff Hanger" which I just learned is actually a Lupin The Third movie!! crazy, when I watched Lupin a few years ago I didn't make the connection)

No giant inventory trees or management games or spread sheets; I do work when I'm working, I don't need to do work in my game.

No abstract puzzle games. I'll play chess or go or something if I want that.

Absolutely no frustrations. Long load times or annoying UI or one bad level that gives me instadeath for no reason - the game's going in the bin. If the game makes me scream at it or grind my teeth in frustration, I don't need that in my life.

My favorite games are generally "light RPG's" like Faery Tale, Zelda, Drakan, where I get to run around a big world, but without a bunch of fucking dialogs, and without managing a big group of characters and stats and such (I generally love true RPG's for about the first quarter of the game, but then you get too many people in your party and too many items and spells and it just becomes a huge pain in the ass).

06-03-11 - Amalgamate

So, here you go : amalgamate code & exe (105k)

The help is this :


amalgamate built May 19 2011, 18:01:28
args: amalgamate
HELP :
usage : amalgamate [-options] <to> <from1> [from2...]
options:
-q  : quiet
-v  : verbose
-c  : add source file's local dir to search paths
-p  : use pragma once [else treat all as once]
-r  : recurse from dirs [list but don't recurse]
-xS : extension filter for includes S=c;cpp;h;inl or whatever
-eS : extension filter for enum of from dir
-iS : add S to include path to amalgamate

from names can be files or dirs
use -i only for include dirs you want amalgamated (not system dirs)

What it does : files that are specified in the list of froms (and match the extension filter for enum of from dir), or are found via #include (and match the extension filter for includes), are concatted in order to the output file. #includes are only taken if they are in one of the -I listed search dirs.

-p (use pragma once) is important for me - some of my #includes I need to occur multiple times, and some not. Amalgamate tells the difference by looking for "pragma once" in the file. eg. stuff like :

#define XX stuff
#include "use_XX.inc"
#define XX stuff2
#include "use_XX.inc"
needs to include the .inc both times. But most headers should only be included once (and those have #pragma once in them).
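
Here is a toy sketch of that include-splicing logic with the "pragma once" rule. It works on an in-memory map of file name to contents instead of real files and search paths, and the names (FileMap, ParseQuotedInclude, Amalgamate) are made up for the example; it is not the actual amalgamate source, and it has no protection against a non-pragma-once file that includes itself :

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical in-memory "file system" : name -> contents.
typedef std::map<std::string, std::string> FileMap;

// If 'line' looks like  #include "name"  return true and fill 'name'.
static bool ParseQuotedInclude(const std::string & line, std::string & name)
{
    size_t p = line.find_first_not_of(" \t");
    if (p == std::string::npos || line.compare(p, 8, "#include") != 0) return false;
    size_t open = line.find('"', p + 8);
    if (open == std::string::npos) return false;
    size_t close = line.find('"', open + 1);
    if (close == std::string::npos) return false;
    name = line.substr(open + 1, close - open - 1);
    return true;
}

// Append 'name' to 'out', splicing in any #include that is found in 'files'.
// Files containing "#pragma once" are emitted only the first time.
static void Amalgamate(const FileMap & files, const std::string & name,
                       std::set<std::string> & doneOnce, std::string & out)
{
    FileMap::const_iterator it = files.find(name);
    if (it == files.end()) return;
    const std::string & text = it->second;
    bool once = text.find("#pragma once") != std::string::npos;
    if (once && !doneOnce.insert(name).second) return;   // already emitted
    std::istringstream in(text);
    std::string line, inc;
    while (std::getline(in, line))
    {
        if (line.find("#pragma once") != std::string::npos) continue; // meaningless after merging
        if (ParseQuotedInclude(line, inc) && files.count(inc))
            Amalgamate(files, inc, doneOnce, out);   // splice the file in place
        else
            out += line + "\n";   // includes outside the set (system headers) stay as-is
    }
}
```

With this rule the use_XX.inc pattern gets pasted in twice, while a normal header with #pragma once only shows up the first time it is reached.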

So for example I made a cblib.h thusly :


amalgamate cblib.h c:\src\cblib c:\src\cblib\LF c:\src\cblib\external -Ic:\src -p -xh;inc;inl -eh

which seems to work. As another test I made an amalgamated version of the test app for rmse_hvs_t that I gave to Ratcliff. This was made with :

amalgamate amalgamate_rmse_hvs_t.cpp main_rmse_hvs_t.cpp rmse_hvs_t.cpp -I. -v -Ic:\src -p

and the output is here : amalgamate_rmse_hvs_t.zip (83k)


But for anything large (like cblib.cpp) this way of sticking files together just doesn't work. It should be obvious why now that we're thinking about it - C definitions last until end of file (or "translation unit" if you like), and many files have definitions or symbols of the same name that are not the same thing - sometimes just by accidental collision, but often quite intentionally!

The accidental ones are things like using "#define XX" in lots of files ; you can fix those by always using your file name in front of definitions that you want to only be in your file scope (or by being careful to #undef) - also local namespacing variables and etc. etc. So you can deal with that.

But non-coincidental collisions are quite common as well. For example I have things like :

replace_malloc.h :
  #define malloc my_malloc

replace_malloc.c :
  void * my_malloc( size_t size ) { return malloc( size ); }

It's very important that replace_malloc.c doesn't include replace_malloc.h , but when you amalgamate it might (depending on order).

Another nasty one is the common case where you are supposed to do some #define before including something. eg. something like :

#define CB_HUFFMAN_UNROLL_COUNT 16
#include "Huffman.h"
that kind of thing is destroyed by amalgamate (only the first include will have effect, and later people who wanted different numbers don't get what they expected). Even windows.h with the _WIN32_WINNT and WIN32_LEAN_AND_MEAN defines gets hosed by this.

You can also get very nasty bugs just by tacking C files together. For example in plain C you could have :

file1 : 
static int x;
file 2 :
static int x = 7;
and in C that is not an error (the first one just counts as a tentative definition), but now two separate variables have become one when amalgamated. I'm sure there are tons of other evil hidden ways this can fuck you.

So I think it's basically a no-go for anything but tiny code bases, or if you very carefully write your code for amalgamation from the beginning (and always test the amalgamated build, since it can pick up hidden bugs).

06-03-11 - New race track near Olympia

The Ridge race track in Shelton is scheduled to open this fall. It certainly doesn't look like it from the video but they claim they are ahead of schedule, and already fully booked through 2012.

The Ridge has a decent road course, but it's rather short, it's 2.42 miles, a small upgrade over the 2.25 miles of Pacific Raceways ; I also really don't like the long straight into a chicane that they are planning at The Ridge, that is the exact kind of feature that kills people. Hopefully PCA runs it without the chicane.

Portland International Raceway (PIR) is even shorter at 1.9 miles ; I found that running the 2.25 mile track at Pacific Raceways was unpleasantly short, it just feels a bit too much like being a hamster on a wheel or just driving around a roundabout, since you're turning the same direction pretty much the whole time; PIR and Pacific Raceways both suck because they have a drag strip sharing the straight with the road course ; drag strips have a variety of surface problems that make them incredibly dangerous to drive over, and there have been several nasty crashes over the years that happen right at the point where the road course comes onto the drag strip. So The Ridge is a big win in that it doesn't have that. Also since it's new it is presumably being designed with decent run-offs and barriers, which our old tracks don't have.

I just found out last week that in the same place (Shelton) you can rent out the airport to drive on; at first I thought you could rent part of the actual runway, which would be awesome, but in fact it's just a big parking lot that the airport owns, which is still better than nothing. Also if you look at Google satellite maps you might see a race track in Shelton already exists. That's the WA State Patrol training track. Bastards.

There's a huge boom of race track development up here right now.

Oregon Raceway Park opened last year, and looks awesome, though a bit too far away from Seattle. It's out in the barren grass land, away from all our deadly trees; I like tracks that are wide open like that because you can see way ahead to know if somebody has had an accident in front of you.

Bremerton has been trying to build a track for a while (currently the old airport there is used for car events, but it's hardly a track); they're hoping to get a Nascar stop, but I doubt it since they're out in the middle of nowhere. Boardman, OR is also dreaming of making this huge PNW Motorspork Park complex with multiple tracks, but they don't have funding yet and that seems like a pipe dream.


It's very hard to find places to drive fast up here. My understanding is that in Europe the track days are open in the sense that you just show up and pay a fee. In the US that does not exist at all because of liability shit. It's always through some kind of club, and they have to pretend that it's "education". (there's also obviously racing, but that's always through a club and then you have to have a race license and a car with cage and fire system and all that). So all the track days here are called "driver education" which is a bit confusing.

Whether or not the event is actually run as education depends on the group and how nitty they are. With some clubs it's a very thin facade of education and they immediately start doing donuts and racing each other. Others are a bit more careful to not violate their insurance policy terms. PCA for example is pretty nitty in the beginning, but once you get signed off to go solo it's basically racing.

The Proformance school at Pacific Raceways, for example, is super nitty and pedantic at first, you have to take the class, and the class is pretty terrible; they actually put up cones on the track to force you to drive "the line". Once you take the class then you can do open lapping after that which is okay.

A lot of the problem with these "racing education" classes is that they are just horrible teachers. They're super pedantic and just not very smart. They teach you what you're supposed to do without teaching you *why* you're supposed to do it, and they don't let you experiment and learn for yourself. It reminds me of the bad old days of being in primary education with small-minded teachers who are teaching you the exact machinery of how to do something instead of teaching you the fundamentals and letting you do it however you want. At the first "ground school" that I went to some guy describes early apex and late apex and then asks "what's the right way to corner?" , and here I am still being optimistic and engaged I say "it depends, it depends on the track surface, and what corners are before and after the current corner", and he says "no, we always late apex". Oh, okay, my bad, I thought we were human beings who could think and discuss and be realistic and intelligent, in fact we are just supposed to repeat some rote nonsense that you read in a rule book once and treat like dogma. So the "education" is unfortunately just really depressing usually.

A couple of us did the Dirtfish Rally School out in Snoqualmie recently, and it definitely suffers from being overly pedantic. Their facility is amazing, it's literally like a video game level with rusted old warehouses and big gravel pits, and driving on gravel is really a shit-load of fun, I was jumping with joy every time I got out of the car. I like gravel track driving a lot better than driving on pavement because you get car rotation and so much more crazy weight transfer dynamics and fun stuff going on and slow safe speeds. But I don't really recommend Dirtfish, it's too expensive for the amount of seat time you get, and they're just a bit too serious about doing things the right way. It would be worth it if doing the class qualified you for open lapping (for example the Proformance class is excruciating but the whole point of it is to qualify for open lapping) but of course Dirtfish won't let you do open lapping in their gravel pit.

6/02/2011

06-02-11 - Shark 007 codecs are malware!

Beware. I had an old version of the Shark 007 codecs installed on my machine. I'm having some stupid video problem (*) ...

(* = my screen flashes once when I start playing a video and then again when I stop; I believe this is something in the GPU; I've tried to turn off all GPU acceleration for video playback, but it's hard to be sure I've got it all because video is such a cluster-fluck; furthermore the damn ATI catalyst has lots of "smart" override modes where it tries to turn on GPU acceleration even when you didn't ask for it, and do things like automatic interlace fixing and "smooth video", all of which I try to turn off but again it's hard to say for sure that I've actually de-fucked the driver (god damnit, give me a fucking video card driver that just does what the D3D calls tell it to do, don't fucking change the mip mode or anti-alias mode or any such shit); in the end I think I tracked it down (**) )

... so after trying various things I figured I would try changing to the newest version of the Shark codecs. Big mistake. It installs Bing and Ask toolbars without asking permission. Fucker. Nicely, Firefox blocks those addons now, but Firefox doesn't let you uninstall blocked addons which is a bit annoying (you can of course uninstall manually).

(** = I'm pretty sure the problem was the ATI "PowerPlay" clock changing feature. When the GPU load changes, it clocks up, and when it changes the clock rate your screen flashes. If I just put the GPU at max clock all the time, then I don't get any flashes. This sort of sucks, because I do want to use PowerPlay but I don't want any damn screen flash. The big problem with leaving the GPU at full clock for me is that it makes the machine hotter, which causes the fans to step up to their higher speeds, which makes them quite noisy; with GPU in power save mode the machine is very quiet indeed; I'm also not sure why the GPU has to clock up to play video, is it because some fucker is still using GPU acceleration? or would it do that even with just CPU playback?).

(BTW a lot of people have noticed a similar problem with Flash Player since v10; they turned on GPU acceleration for video by default, and now your screen pops when you start and stop Flash videos. You can fix it by turning off hardware acceleration in the Flash settings).

Hey bone heads! You don't need to hardware accelerate something that plays back just fine on the CPU! Especially when "hardware accelerate" means "break" as it inevitably does because GPU drivers/users are just always fucked up.

Sometimes I wish you could still buy/use cards like the sweet old Matrox 2d-only cards that just never had any problems. I used to always buy those cards for my home machines back in the Voodoo days and they were solid as a rock.


On a related malware note, for a while now I've been doing "safe browsing" by running Firefox from my ram disk. That way any changes made to my profile by malignant sites just go away when I reboot. I started doing this because of Firefox's insistence on thrashing my disk for its SQL db, and discovered the safety as a side effect.

For example, when you stumble on an attack site, it can be dangerous to even close the window to that site (because they can run on-close triggers). It's better if you just don't click any popup or do anything at all. Instead I run a batch that does :


pskill -t firefox.exe
call dele -r s:\*
call setup_firefox_ramdisk.bat

(s: is my ramdisk that I run firefox from).

The other nice safety feature is that you can wipe cookies and all saved/cached state whenever you want. I used to try to browse with things like Flash completely disabled, no cookies, etc. but it's just impossible to use the modern internet that way. But I still don't want any website to remember my settings ever, so I start from a clean slate each time. (actually I start from whatever saved state I want, so I made a saved state that has my google login in it). It's way better than actually browsing with no cookies or even just clearing your browser cache periodically.
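
The wipe-and-restore step is roughly this, sketched in Python rather than batch. This is only a guess at what the batch amounts to, since setup_firefox_ramdisk.bat isn't shown here: the function name `reset_profile` and both paths are made up for illustration, and it assumes Firefox has already been killed so nothing in the profile is locked.

```python
import shutil
from pathlib import Path


def reset_profile(ramdisk: Path, saved_state: Path) -> None:
    """Delete everything under `ramdisk`, then copy `saved_state` into it.

    Hypothetical helper: assumes the browser is not running, so no file
    in the profile directory is held open.
    """
    ramdisk.mkdir(parents=True, exist_ok=True)

    # Equivalent of "dele -r s:\*" : remove every file and subdirectory,
    # but keep the ramdisk root itself.
    for entry in ramdisk.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()

    # Restore the known-good state (e.g. a profile with one login saved).
    # dirs_exist_ok needs Python 3.8 or newer.
    shutil.copytree(saved_state, ramdisk, dirs_exist_ok=True)
```

Anything a site wrote into the profile since the last reset is gone after this runs; only what is in the saved copy survives.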


On a more positive note, I found this and love it :

Hide Comments with AdBlock Plus

I'm now using these rules :


youtube.com##div[id="comments-view"] 
thestranger.com##div[id="BrowseComments"]
seattleweekly.com##div[id="comments"]

I really would like to hide comments for *every* site I ever go to. I find that hearing from the common man is absolutely toxic. Sometimes it is just infuriating and depressing how terrible they are, when they say things that are racist or ignorant or small minded or just petty bickering about Justin Bieber. But even when the comments aren't as obviously toxic, they are still very bad for the brain, because you are influenced by them whether you think so or not, and that influence is almost always negative.
