# cbloom rants

## 8/14/2011

### 08-14-11 - A note on convex hull simplification

I wrote this in email and thought it worth recording.

It's a little unclear, so I clarified :

My algorithm is very simple and by no means optimal.

I construct a standard (exact) convex hull, then make a mesh from it. I then run a normal mesh simplifier (see for example Garland Heckbert Quadric Error Metrics) to simplify the CH as if it was a mesh. This can ruin inclusion. I then fix it by taking all the face planes of the simplified mesh and pushing them out past any vert in the original mesh.

Stan's (Melax - Convex Hull Simplification With Containment By Successive Plane Removal) way is similar but better. He uses a BSP engine to create the hull. First he finds a normal convex hull. Then he considers only the planes that make up that hull. The working hull is the volume that is on the "front" side of all planes. He then considers removing planes one by one. When you remove a plane, the cost to remove it is the volume that is added to the hull, which is the volume of the space that is on the back side of that plane but is on the front side of all other planes. You create a heap to do this so that the total cost to simplify is only N log N. This requires good BSP code which I don't have, which is why I used the mesh-simplifier approach.

An alternative in the literature is the "progressive hull" technique. This is basically using PM methods but directly considering the mesh as a hull during simplification instead of fixing it after the fact as I do. Probably a better way is to use a real epsilon-hull finder from the beginning rather than finding the exact hull and then simplifying.

My code is in Galaxy4 / gApp_HullTest which is available here ; You should be able to run "Galaxy4.exe hull" ; Hit the "m" key to see various visualations ; give it a mesh argument if you have one (takes .x, .m , .smf etc.)

BTW to summarize : I don't really recommend my method. It happens to be easy to implement if you have a mesh simplifier lying around. Stan's method is also certainly not optimal but is easy to implement if you have good BSP code lying around (and is better than mine (I suspect)).

The technique I actually prefer is to just use k-dops. k-dops are the convex hull made from the touching planes in a fixed set of k directions. Maybe find the optimal OBB and use that as the axis frame for the k directions. Increase k until you are within the desired error tolerance (or k exceeds the number of faces in the exact hull).

ASIDE : I have some BSP code but I hate it; I hate all floating point geometry code. I love integer geometry code. The problem with integers in BSP's is that clipping creates rational points. Maybe I'll write some BSP routines based on rational Vec3's. The only big problem is that the precision requirement goes up with each clip. So you either need arbitrary precision rationals or you have to truncate the precision at some maximum, and then handle the errors created by that (like the truncated point could move onto the back side of a plane that you said you were in front of). (this is better than the errors in floating points, because at least the truncated point is at a definite well defined location, floating points move around depending on how you look at them, those wiggly bastards) (I'm tempted to say that they're like quantum mechanics in that they change when you measure them, except that they really aren't at all, and that's the type of pseudo-scientific-mumbo-jumbo that pseudo-intellectual fucktards love and I so despise, so no, I won't say it).

## 8/12/2011

### 08-12-11 - The standard cinit trick

Sometimes I like to write down standard tricks that I believe are common knowledge but are rarely written down.

Say you have some file that does some "cinit" (C++ class constructors called before main) time work. A common example is like a factory that registers itself at cinit time.

The problem is if nobody directly calls anything in that file, it will get dropped by the linker. That is, if all uses are through the factory or function pointers or something like that, the linker doesn't know it gets called that way and so drops the whole thing out.

The standard solution is to put a reference to the file in its header. Something like this :

```
Example.cpp :

int example_cpp_force = 0;

AT_STARTUP( work I wanted to do );

Example.h :

extern int example_cpp_force;

AT_STARTUP( example_cpp_force = 1 );

```
where AT_STARTUP is just a helper that puts the code into a class so that it runs at cinit, it looks like this :
```
#define AT_STARTUP(some_code)   \
namespace { static struct STRING_JOIN(AtStartup_,__LINE__) { \
STRING_JOIN(AtStartup_,__LINE__)() { some_code; } } STRING_JOIN( NUMBERNAME(AtStartup_) , Instance ); };

```
Now Example.obj will be kept in the link if any file that includes Example.h is kept in the link.

This works so far as I know, but it's not really ideal (for one thing, if Example.h is included a lot, you get a whole mess of little functions doing example_cpp_force = 1 in your cinit). This is one of those dumb little problems that I wish the C standards people would pay more attention to. What we really want is a way within the code file to say "hey never drop this file from link, it has side effects", which you can do in certain compilers but not portably.

## 8/11/2011

### 08-11-11 - Inflation - 4

Now some random hand waving and incoherent thoughts.

The alternative money supply metrics like TMS or "M1 + deposit currency" show that the money supply has massively increased since the great recession (roughly the same as the fed "BASE" ). Historically TMS tracks inflation very closely. So far it appears not to be. One possible explanation is that we basically continue to be in a recession, which is driving prices down, while also inflating the currency to keep prices stable.

Of course M2 and M3 are still way down due to lack of money multiplier in our recessed economy. In theory if credit picks up again and M3 takes off, the Fed will crank back on the money supply and keep things under control. But I believe there are reasons to be skeptical that the Fed will ever really crack down on the loose monetary policy.

The PIMCO newsletter is surprisingly bleak -

the only way out of the dilemma, absent very large entitlement cuts, is to default in one (or a combination) of four ways: 1) outright via contractual abrogation � surely unthinkable, 2) surreptitiously via accelerating and unexpectedly higher inflation � likely but not significant in its impact, 3) deceptively via a declining dollar� currently taking place right in front of our noses, and 4) stealthily via policy rates and Treasury yields far below historical levels � paying savers less on their money and hoping they won�t complain.

Basically the US has got itself into massive debt; I don't really agree that entitlements are the big problem, certainly not in the short term, but anyhoo. When you're a debtor, inflation is great for you - it makes your debt smaller. We fund the debt by selling treasuries. Our debt is much cheaper to fund if we can offer a very low return on treasuries. So the best option for the US government is clearly to inflate the currency and devalue the dollar (reduces our debt) while claiming that inflation is low (keeps treasury yields low).

In other crackpottery, QE is very strange. My very crude cliff notes :

```
To inject cash into the economy, the Fed bought treasuries from private holders
this takes out non-cash "paper" and adds to the money supply

The government is in debt; to finance that debt it sells treasuries
this gives the government cash in exchange for paper

So during QE, banks bought bonds from Treasury, then sold them to the Fed

Basically the Fed was just giving cash to Treasury, but passing it through banks so they could take a piece of profit

```
Oddly the US apparently has a law that prevents the Fed from directly buying Treasuries and thus supporting the government debt; also Bernanke and such claim they are not "printing money to monetize the debt" but really passing the money through the open market doesn't change much other than giving a slice of profit to the banks. Says the Dallas Fed : "For the next eight months, the nation�s central bank will be monetizing the federal debt."

Furthermore there is good evidence that QE largely backfired at injecting money into the economy. The problem is that with the fed funds rate at 0% and Treasuries at 2-2.5% , banks can take out fed funds, buy treasuries, then sell them to the fed under QE. Free money for the banks and the Fed gets to pay for government debt, yay.

This is certainly part of why bond yields are so low; they don't need to return much when cash is free. Apparently Japan has been through all this before. There's evidence that QE schemes basically never work; when the fed funds rate is near zero, all possible profitable investments that can be made already have been made, so why in the world would injecting more liquidity into the market help? The only explanation I can see is to intentionally devalue the currency or cause inflation. (of course as we'll note later, QE was not only a monetary policy, it was also a direct and corrupt subsidy for toxic asset holders)

Side note : the Dallas Fed uses Trimmed Mean PCE instead of Core PCE. If you read the press releases from the Dallas Fed or St. Louis Fed it is encouraging that there are still technocrats in government that are trying to be reasonably honest and do their jobs well. Of course they tend to get squashed by the people in power, but still ...

What is QE really doing? Propping up stock values and other investments (particularly MBS'es). Giving banks free profits. Createing an outflow of money from the US to emerging markets. Sending up commidity prices.

"QE effects on commodity markets have been significant. Between August 2011 and January 2011, commodity prices (as measured by CRB Index) rose by 14%. Oil prices have increased by around 20% and average gasoline prices have increased around 15%. Food prices (as measured by the CRB Food Index) have increased 12%, with some individual foodstuffs rising more sharply."

It's unclear to me how you can defend dumping liquidity into the system when banks were already sitting on massive amounts of liquidity with nothing to do with it other than hold it in the Fed (*) or treasuries. It's like watering a sick plant that's already soaked in water.

* = this is one of the funnier quirks; banks actually hold their large cash balance at the federal reserve, so when they got massive cash injects from QE the main thing that happened was that their balance of cash sitting in the fed went sky high, and the fed pays interest to the banks on that deposit - banks now have excess reserves (cash beyond their reserve requirement) around \$1.5 Trillion sitting in the Fed, up from only \$2 billion in 2007. If your goal is to get banks to lend more, why do you want to make it more attractive for them to leave cash sitting in reserves? This was yet another change in 2008 as part of the massive experiment of letting the Fed tinker with the economy in unprecedented ways. Some links :

One common thread that I don't understand is why even a tiny bit of deflation is considered such a huge disaster that almost anything will be done to prevent it.

I can't imagine any way to justify the Fed purchasing MBS's and other such assets. The role of the Fed is supposed to be monetary policy, not subsidizing bad investments or otherwise interfering in markets. The Fed is not the SEC or Treasury. They just slipped it in as part of "QE" which is supposed to just be increasing the money supply and bought \$1.25 B of MBS's without even attempting a fair valuation of them. That's basically an illegal expansion of the TARP program. The amount of toxic MBS that the Fed owns now dwarfs what Treasury holds.

Another odd footnote is the fact that banks are no longer required to mark-to-market MBS's and similar products, so we have no idea what's really on their balance sheets.

Another funny one is that Fannie/Freddie have gone from about 20% of the mortgage market to about 80% since the collapse.

If it is the Fed's role to moderate volatility, then it must also slow growth in over-exuberant times. It has shown no willingness to do so in recent years and I can't imagine that it ever will, given the political corruption and collusion that drives decision making. It's like having a train conductor that stokes the fire when you slow down but refuses to use the brakes when you get going to fast.

### 08-11-11 - Free Internet

I mean "free" in a liberty sense, not a monetary sense.

Recent Seattle Weekly article got me thinking about trying to encrypt and anonymize all my internet access. The whole torrent model is just like fish in a barrel for copyright trolls. You can just hop on the net and get a list of infringers any time you want.

So whatever reason, say you want to be able to work on the net and do as you please without your actions being monitored.

Apparently the major US-based services like FindNot and Anonymizer are not to be trusted (they provide logs to the US government and to subpoenas by the RIAA etc).

Really what you want is something like Tor that takes all your traffic and bounces it around a bunch of other machines and then puts out portions of requests from all over. Currently none of those services seem to be quite ready for prime time; Tor for example kicks you out if you try to do high-bandwidth things like torrents.

In general I'm not sure if dark-nets like Tor can survive. I don't trust the internet providers or the US government to allow you to have that freedom. I suspect that if they ever caught on en masse they would be blocked by the standard extra-judicial mechanisms that they used to shut down online poker and funding WikiLeaks (where the government nicely asks the service provider to block that traffic and the provider complies, even though it's not clear the law is on their side).

The only way to get past that (and into places like china) is to hide encrypted packets inside benign packets. That may be fine for little text messages, but you can never get high bandwidth that way.

## 8/09/2011

### 08-09-11 - The Lobster

(this coinage is so obvious I must have stolen it from somewhere, anyway...)

I've been thinking a lot recently about "the lobster".

I've always thought it was bizarre how you can pull into any podunk town in America and go to the scary local diner / steak house, and there will be the regular items - burger, chicken fried steak, what have you, all under \$10, and then there's the lobster, for \$30, ridiculously overpriced, tucked in the corner of the menu with decorative squiggles around it (as if it needs velvet ropes to separate the VIP section of the menu from the plebian fare).

The thing is, the lobster is not actually good. They probably can't remember the last time anybody actually ordered the lobster. No local would; if the waitress likes you she would warn you not to get, the chefs roll their eyes when the order comes in. Why is it on the menu at all?

I guess it's just there as a trap, for some sucker who doesn't know better, for someone wanting to show off the money they just won, or someone on an expense account to waste money on. You're really just humiliating yourself when you order it, and the restaurant is laughing at you.

I think most people know that you don't actually ever order the lobster in restaurants (other than lobster-specializing places in like Maine or something). But "the lobster" can pop up in many other guises. Expensive watches are obvious lobsters, expensive cars can be less obvious lobsters (is a Maserati a lobster? an Alfa? an Aston? a Porsche?), certainly some of the options and special editions are obvious lobsters, for example the recent Porsche "Speedster" special edition that cost \$250k and was just a regular Carrera other than a few colored bits, that's clearly a lobster and Porsche laughs and rolls their eyes at the Seinfelds of the world who are stupid enough to buy the Porsche lobster just because it was on the menu with squiggly lines around it.

I feel like a lot of salesmen try to slip the lobster on you when you're not paying attention. Like when the contractor asks if you want your counters in wood or stone or italian marble - hey wait, contractor, that's the lobster! okay, yeah, you got me, I don't even know where to get italian marble but I thought I'd try to slip it in there. Home improvement in general is full of lobsters. Home theatre stores usually carry a lobster; car wheels ("rims") are rife with lobsters.

The thing that makes the nouveau riche so hilarious is they are constantly getting suckered into buying the lobster and then have the stupidity to brag about it. Ooo look at my gold plated boat ; you fool, you bought the lobster, hide your shame!

One of the things that's so satisfying about video games is that you get a clear reward for more work. You kill some monsters, you get experience, you go up a level; you collect 200 gems, now you can buy the red shield, and it is objectively better than the blue shield you had before. It's very simple and satisfying.

Life is not so clear. More expensive things are not always better. Doing more work doesn't necessarily improve your life. This can be frustrating and confusing.

One of the things that makes me lose it is video game designers who think it's a good idea to make games more realistic in this sense, like providing items in the stores that are expensive but not actually very good. No! I don't want to have to try to suss out "the lobster" in the video game blacksmith, you want video game worlds to be an escapist utopia in which it's always clear that spending more money gets you better stuff. (the other thing I can't stand is games that take away your items; god dammit, don't encourage me to do the work for that if you're going to take it away, don't inject the pains of real life into games, it does not make them better!)

### 08-09-11 - Inflation - 3

Some reference :

Some of these guys have the whiff of crackpottery which should give us a bit of pause. Nevertheless...

We can track down a few of the strange problems that I identified last time.

Education basically is miscounted : "The inclusion of financial aid has added to the complexity of pricing college tuition. Many selected students may have full scholarships (such as athletic), and therefore their tuition and fixed fees are fully covered by scholarships. Since these students pay no tuition and fees, they are not eligible for pricing." discounting financial aid makes some sense if you are trying to measure the consumer's expense, but not if you are trying to measure the cost of the good; just because someone else paid for part of it doesn't make it cheaper. But really I imagine the biggest problem with education cost is that they effectively count college as being free for people who can't afford college. That is, people who can't afford it don't buy it, so it's not in the basket (doesn't contribute to the "quantity" in the CPI metric). A better way to measure inflation would be to assume that everyone would go to college if they could afford it. (also it seems that non-acredited technical school time places are not counted at all)

Health care is simply not counted at all, by design : "The weights in the CPI do not include employer-paid health insurance premiums or tax-funded health care such as Medicare Part A and Medicaid" The only thing they count is out-of-pocket / discretionary health care expenses, which are obviously just a tiny fraction of the total.

Real estate has the funny owner's cost to rent thing which makes it very hard to tell if that is being gamed or not.

Obviously anything based on "core" inflation (without food or energy) is ridiculous. The standard argument that those fluctuate too much seasonally is absurd, you could just use a seasonally-adjusted moving average, you don't need to remove them completely.

The other really obviously fishy parts are :

"Substitution". A while ago the CPI was changed to use geometric averages of prices within a category. This seems pretty innocuous, but it basically causes a down-weighting of higher priced items. And in fact the geometric mean is always lower than the arithmetic mean, so this change can only make inflation seem lower, which is a dirty trick. For example :

```
(1+1+8)/3 = 3.333

(1*1*8)^(1/3) = 2.0

```
pretty big difference even though they are both "means". Now, they hand wave away and say that this reflects consumers' ability to choose and substitute cheaper products. But it is totally unscientific.

Furthermore, newer measures like the CPI-U or Fed's PCE also explicitly include substitution. This just seems like it obviously does not reflect inflation. When a product gets expensive and the consumer substitutes for something cheaper, they are by definition getting something of lower utility (because it wasn't their first choice), so you can't say that no inflation happened, they are getting less for their money.

"Hedonics". These are poorly documented pure bullshit ways of pretending inflation is lower by claiming that we got more for our money. This is just pure nonsense for various reasons :

1. The whole definition of "better" is so vague and open to interpretation that it has no business in a metric. For example they consider air travel to be massively improved since the 70's. Sure it's safer, more efficient, but also much much less pleasant. Personally I think that the same trip is actually worth much less now than it was in the past, but they say it's worth much more. Similarly for the quality of buildings and clothing and cars and so on; yes, they're safer, faster, more durable, whatever, but they aren't hand crafted, they aren't made of hard wood and steel and chrome; I think most of those things are actually much crappier now than ever, made more cleverly but also more cheaply. Anyway, it just has no business in there. The idea that you can measure the hedonic quality of some product and say it improved by 0.1% from April to May in 2010 is just absurd.

2. Using quality of goods at all just isn't right to begin with. Inflation should be a measure of the cost to buy a standard set of goods at the expected quality level of the era. Just because technology gets better over time doesn't mean you can discount the inflation! For example if computers get 50% better every year and our money is inflating by 50% would you say the cost of computers is not changing? Of course not, the cost is going up 50% , yes they are also getting better but that is not part of the discussion.

It's just wrong on the face of it. If a median decent car was \$5k in the 70's and now is \$25k , then the price of cars has gone up by 5X. But oh no they say, you have air bags and more power and fuel economy and so on, the modern car is 5X better, so in fact there has been no inflation at all. Well, wait a minute. I *expect* the quality of life and technology to go up over time. Are you telling me that in a world with 0% inflation that technology does not get better over time? That's a strange way to measure things. And it's not really what you want to know when you ask about inflation. You want to know how much money do I need to afford a decent house, car, food, etc. at the expected standard of the time that I buy it.

Obviously we wish there was some item that had absolute constant value that we could measure against. Also obviously measuring inflation is very complicated and we are only scratching the surface. But it's very fishy. Rotten fishy.

### 08-09-11 - Inflation - 2

The government has made several significant changes to how it counts inflation. A big one occurred in 1995 (Boskin commission), another happened just in the last two years (chained CPI for COLA), and at some point the Fed changed it's core measure (PCE).

In all cases they claimed to be making the inflation measure "more accurate" , and in all cases, the inflation rate was revised downward.

Now, ignoring the details of the changes for now, it should be clear the government has a very strong interest in reporting a low inflation number.

1. It makes their administration look better. If inflation is low, then inflation-adjusted GDP looks better. There are lots of horrifying statistics about inflation-adjusted median income that are very embarassing to the US government, and you can make that go away by having a lower inflation number.

2. Lots of federal costs have automatic COLA (cost of living adjustments) like Social Security, Medicare, federal employee pensions an wages, etc. A lower inflation number directly decreases the amount they have to pay out.

3. They pay less out for TIPS

Whenever governments can lie for their own benefit, they tend to, so it would be *extremely* surprising if the reported inflation was actually correct.

## 8/08/2011

### 08-08-11 - Inflation - a sanity check

```
Fuel prices have risen faster than (nominal) inflation.

Education - much faster than inflation.

Housing faster than inflation.

Food faster than inflation.

Health care faster than inflation.

Gold, copper, etc. - much faster than inflation.

Foreign currency - faster than inflation.

```
Ummm....

Something is wrong with this picture.

## 8/01/2011

### 08-01-11 - Non-mutex priority inversion

An issue I don't see discussed much is non-mutex priority inversion.

First a review of mutex priority inversion. A low priority thread locks a mutex, then loses execution. A high priority thread then tries to lock that mutex and blocks. It gives up its time slice, but a bunch of medium priority threads are available to run, so they take all the time and the low priority thread doesn't get to run. We call it "priority inversion" because the high priority thread is getting CPU time as if it was the same as the low priority thread.

Almost all operating systems have some kind of priority-inversion-protection built into their mutex. The usual mechanism goes something like this : when you block on a mutex, find the thread that currently owns it and either force execution to go to that thread immediately, or boost its priority up to the same priority as the thread trying to get the lock. (for example, Linux has "priority inheritance").

The thing is, there are plenty of other ways to get priority inversion that don't involve a mutex.

The more general scenario is : a high priority thread is waiting on some shared object to be signalled ; a low priority thread will eventually signal that object ; medium priority threads take all the time so the low priority thread can't run, and the high priority thread stays blocked.

For example, this can happen with Semaphores, Events, etc. etc.

The difficulty is that in these cases, unlike with mutexes, the OS doesn't know which thread will eventually signal the shared object to let the high priority thread go, so it doesn't know who to boost.

Windows has panic mechanisms like the "balance set manager" which look for any thread which is not waiting on a waitable handle, but is getting no CPU time, then they force it to get some CPU time. This will save you if you are in one of these non-mutex priority-inversions, but it takes quite a long time for that to kick in, so it's really a last ditch panic save, if it happens you regret it.

Sometimes I see people talking about mutex priority inversion as if that's a scary issue; it's really not on any modern OS. But non-mutex priority inversion *is*.

Conclusion : beware using non-mutex thread flow control primitives on threads that are not of equal priority !

### 08-01-11 - Double checked wait

Something that we have touched on a few times is the "double checked wait" pattern. It goes like this :
 ```consumer : if ( not available ) { prepare_wait(); if ( not available ) { wait(); } else { cancel_wait(); } } producer : make available signal_waiters(); ```
now, why do we do this? Well, if you did just a naive check like this :
```
consumer :

if ( not available )
{
// (*1)
wait();
}

producer :

make available
signal_waiters();

```
you have a race. What happens is you check available and see none, so you step in to *1 ; then the producer runs, publishes whatever and signals - but there are no waiters yet so the signal is lost. Then you go into the wait() and deadlock. This is the "lost wakeup" problem.

So, the double check avoids this race. What must the semantics of prepare_wait & wait be for it to work? It's something like this :

Any signal that happens between "prepare_wait" and "wait" must cause "wait" to not block (either because the waitable handle is signalled, or through some other mechanism).

Some implementations of a prepare_wait/wait mechanism may have spurious signals; eg. wait might not block even though you shouldn't really have gotten a signal; because of that you will usually loop in the consumer.

Now let's look at a few specific solutions to this problem :

condition variables

This is the locking solution to the race. It doesn't use double-checked wait, instead it uses a mutex to protect the race; the naive producer/consumer is replaced with :

```
consumer :

mutex.lock();
if ( not available )
{
unlock_wait_lock();
}

producer :

mutex.lock();
make available
signal_waiters();
mutex.unlock();

```
which prevents the race because you hold the mutex in the consumer across the condition check and the decision to go into the wait.

waitset

Any simple waitset can be used in this scenario with a double-checked wait. For example a trivial waitset based on Event is like this :

```
waitset.prepare_wait :
add current thread's Event to list of waiters

waitset.wait :
WaitForSingleObject(my Event)

waitset.signal_waiters :
signal all events in list of waiters

```
for instance, "waitset" could be a vector of handles with a mutex protecting access to that vector. This would be a race without the prepare_wait and double checking.

In this case we ensure the double-checked semantics works because the current thread is actually added to the waitset in prepare_wait. So any signal that happens before we get into wait() will set our Event, and our wait() will not actually block us, because the event is already set.

eventcount

Thomasson's eventcount accomplishes the same thing but in a different way. A simplified version of it works like this :

```
eventcount.prepare_wait :
return key = m_count

eventcount.wait :
if ( key == m_count )
Wait(event)

eventcount.signal_waiters :
m_count++;
signal event;

(note : event is a single shared broadcast event here)

```
in this case, prepare_wait doesn't actually add you to the waitset, so signals don't go to you, but it still works, because if signal was called in the gap, the count will increase and no longer match your key, so you will not do the wait.

That is, it specifically detects the race - it sees "was there a signal between when I did prepare_wait and wait?" , and if so, it doesn't go into the wait. The consumer should loop, so you keep trying to enter the wait until you get to check your condition without a signal firing.

futex

It just occurred to me yesterday that futex is actually another solution to this exact same problem. You may recall - futex does an internal check of your pointer against a value, and only goes into the wait if the value matches.

producer/consumer with futex is like this :

```
consumer :

if ( value = not_available )
{
futex_wait(&value,not_available);
}

producer :

value = available
futex_signal(&value);

```
this may look like just a single wait at a glance, but if we blow out what futex_wait is doing :
```
consumer :

if ( value == not_available )
{
//futex_wait(&value,not_available);

futex_prepare_wait(&value);
if ( value == not_available )
futex_commit_wait(&value);
else
futex_cancel_wait(&value);
}

producer :

value = available
futex_signal(&value);

```
we see can clearly see that futex is just double-checked-wait in disguise.

That is, futex is really our beloved prepare_wait -> wait pattern , but only for the case that the wait condition is of the form *ptr == something.

Do we like the futex API? Not really. I mean it's nice that the OS provides it, but if you are designing your own waitset you would never make the API like that. It confines you to only working on single ints, and your condition has to be int == value. A two-call API like "prepare_wait / wait" is much more flexible, it lets you check conditions like "is this lockfree queue empty" which are impossible to do with futex (what you wind up doing is just doing the double-check yourself and use futex just as an "Event", either that or duplicating the condition into an int for futex's benefit (but that is risky, it can race if not done right, so not recommended)).

BTW some of the later extensions of futex are very cool, like bitset waiting and requeue.

### 08-01-11 - A game threading model

Some random ideas.

There is no "main" thread at all, just a lot of jobs. (there is a "main job" in a sense, that runs once a frame kicks off the other jobs needed to complete that frame)

Run 1 worker thread per core; all workers just run "jobs", they are all interchangeable. This is a big advantage for many reasons; for example if one worker gets swapped out (or some outside process takes over that CPU), the other workers just take over for it, there is never a stall on a specific thread that is swapped out. You don't have to switch threads just to run some job, you can run it directly on yourself. (caveat : one issue is the lost worker problem which we have mentioned before and needs more attention).

You also need 1 thread per external device that can stall (eg. disk IO, GPU IO). If the API's to these calls were really designed well for threading this would not be necessary - we need a thread per device simply to wrap the bad API's and provide a clean one out to the workers. What makes a clean API? All device IO needs to just be enqueue'd immediately and then provide a handle that you can query for results or completion. Unfortunately real world device IO calls can stall the calling thread for a long time in unpredictable ways, so they are not truly async on almost any platform. These threads should be high priority, do almost no CPU work, and basically just act like interrupts.

A big issue is how you manage locking game objects. I think the simplest thing conceptually is to do the locking at "game object" granularity, that may not be ideal for performance but it's the easiest way for people to get it right.

You clearly want some kind of reader/writer lock because most objects are read many more times than they are written. In the ideal situation, each object only updates itself (it may read other objects but only writes itself), and you have full parallelism. That's not always possible, you have to handle cross-object updates and loops; eg. A writes A and also writes B , B writes B and also writes A ; the case that can cause deadlock in a naive system.

So, all game objects are referenced through a weak-reference opaque handle. To read one you do something like :

```    const Object * rdlock(ObjectHandle h)
```
and then rely on C's const system to try to ensure that people aren't writing to objects they only have read-locked (yes, I know const is not ideal, but if you make it a part of your system and enforce it through coding convention I think this is probably okay).

In implementation rdlock internally increments a ref on that copy of the object so that the version I'm reading sticks around even if a new version is swapped in by wrlock.

There are various ways to implement write-lock. In all cases I make wrlock take a local copy of the object and return you the pointer to that. That way rdlocks can continue without blocking, they just get the old state. (I assume it's okay for reads to get one-frame-old data) (see note *). wrunlock always just exchanges in the local object copy into the table. rdlocks that were already in progress still hold a ref to the old data, but subsequent rdlocks and wrlocks will get the new data.

One idea is like this : Basically semi-transactional. You want to build up a transaction then commit it. Game object update looks something like this :

```    Transaction t;
vector`<`ObjectHandle> objects_needed;
objects_needed = self;
for(;;)
{
wrlock on all objects_needed;

.. do your update code ..
.. update code might find it needs to write another object, then do :

add new_object to objects_needed
if ( ! try_wrlock( new_object ) )
continue; // aborts the current update and will restart with new_object in the objects_needed set

wrunlock all objects locked
if ( unlocks committed )
break; // update done
}

```
(in actual C++ implementation the "continue" should be a "throw" , and the for(;;) should be try/catch , because the failed lock could happen down inside some other function; also the throw could tell you what lock caused the exception).

There's two sort of variants here that I believe both work, I'm not sure what the tradeoffs are :

1. More mutex like. wrlock is exclusive, only one thread can lock an object at a time. wrunlock at the end of the update always proceeds unconditionally - if you got the locks you know you can just unlock them all, no problem. The issues is deadlock for different lock orders, we handle that with the try_lock, we abort all the locks and go back to the start of the update and retake the locks in a standardized order.

2. More transaction like. wrlock always proceeds without blocking, multiple threads can hold wrlock at the same time. When you wrunlock you check to see that all the objects have the same revision number as when you did the wrlock, and if not then it means some other commit has come in while you were running, so you abort the unlock and retry. So there's no abort/retry at lock time, it's now at unlock time.

In this simplistic approach I believe that #1 is always better. However, #2 could be better if it checked to see if the object was not actually changed (if it's a common case to take a wrlock because you thought you needed it, but then not actually modify the object).

Note that in both cases it helps to separate a game object's mutable portion from its "definition". eg. the things about it that will never change (maybe its mesh, some AI attributes, etc.) should be held to the side somehow and not participate in the wrlock mechanism. This is easy to do if you're willing to accept another pointer chase, harder to do if you want it to just be different portions of the same continuous memory block.

Another issue with this is if the game object update needs to fire off things that are not strictly in the game object transaction system. For example, say it wants to start a Job to do some path finding or something. You can't fire that right away because the transaction might get aborted. So instead you put it in the "Transation t" thing to delay it until the end of your update, and only if your unlocks succeed then the jobs and such can get run.

(* = I believe it's okay to read one frame old data. Note that in a normal sequential game object update loop, where you just do :

```
for each object
object->update();

```
each object is reading a mix of old and new data; if it reads an item in the list before itself, it reads new data, if it reads an item after itself, it reads old data; thus whether it gets old or new data is a "race" anyway, and your game must be okay with that. Any time you absolutely must read the most recent data you can always do a wrlock instead of a rdlock ;

You can also address this in the normal way we do in games, which is separate objects in a few groups and update them in chunks like "phase 1", then "phase2" ,etc. ; objects that are all within the same phase can't rely on their temporal order, but objects in a later phase do know that they see the latest version of the earlier phase. This is the standard way to make sure you don't have one-frame-latency issues.

*).

The big issue with all this is how to ensure that you are writing correct code. The rules are :

1. rdlock returns a const * ; never cast away const

2. game object updates must only mutate data in game objects - they must not mutate global state or anything outside of the limitted transaction system. This is hard to enforce; one way might be to make it absolutely clear with a function name convention which functions are okay to call from inside object updates and which are not.

For checking this, you could set a TLS flag like "in_go_update" when you are in the for {} loop, then functions that you know are not safe in the GO loop can just do ASSERT( ! in_go_update ); which provides a nice bit of safety.

3. anything you want to do in game object update which is not just mutating some GO variables needs to be put into the Transaction buffer so it can be delayed until the commit goes through. Delayed transaction stuff cannot fail; eg. it doesn't get to participate in the retry/abort, so it must not require multiple mutexes that could deadlock. eg. they should pretty much always just be Job creations or destructions that are just pushes/pops from queues.

Another issue that I haven't touched on is the issue of dependencies. A GO update could be dependent on another GO or on a Job completion. You could use the freedom of the scheduling order to reschedule GOs whose dependencies aren't done for later in the tick, rather than stalling.