03-29-08 - 2

cubic maxaccel maths

I figure I should write up the maths for the record since I've described it previously but haven't been totally clear. There's an implementation in testdd you can look at.

The goal is to make a curve that goes from initial position & velocity {X0,V0} to final position & velocity {X1,V1}. Position and velocity are N-d vectors. Traveling this curve, you should at no point exceed the maximum acceleration "m". m is a scalar. I'll use caps for vectors and lower case for scalars.

The lowest order polynomial curve that fits these conditions is a cubic. A parametric cubic has 4 vector coefficients, which can be used to fit the end positions & velocities. It also has a free scalar - the time to travel the parametric interval from 0 to 1. This duration can be fit to the max acceleration "m".

Using the Bezier form we have :

B(s) = B0 *(1-s)^3 + B1 *3*(1-s)^2*s + B2 *3*(1-s)*s^2 + B3 *s^3

s = (t / d) is parametric time
t = real time
d = duration of curve
And fitting to the end conditions :
B0 = X0
B1 = X0 + V0*(d/3)
B2 = X1 - V1*(d/3) 
B3 = X1
Now, the velocity of a cubic curve is quadratic, and the acceleration is linear. That means the highest acceleration on the interval [0,1] is at either 0 or 1, so we just have to check the acceleration at the two end points :
A(0) = 6*(B0 + B2 - 2B1)/d^2
A(1) = 6*(B3 + B1 - 2B2)/d^2
Now you have to solve both end points and use the longer of the two durations (each solve gives the minimum duration that satisfies its own endpoint, and you need to satisfy both). You may want to early out to handle the trivial case : if you can use a tiny value for "d" and not exceed maximum acceleration at either end point, then just return. In fact "tiny value" could be the duration of the frame - if you can step the whole curve in one frame then just do that.
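As a concrete sketch of the endpoint check (scalars for brevity; in N-d you'd take vector norms, and the function names here are my own, not from testdd) :

```python
def endpoint_accels(X0, V0, X1, V1, d):
    # Fit the Bezier control points to the end conditions, then
    # evaluate the (linear) acceleration at s = 0 and s = 1.
    B0 = X0
    B1 = X0 + V0 * (d / 3.0)
    B2 = X1 - V1 * (d / 3.0)
    B3 = X1
    a0 = 6.0 * (B0 + B2 - 2.0 * B1) / (d * d)
    a1 = 6.0 * (B3 + B1 - 2.0 * B2) / (d * d)
    return abs(a0), abs(a1)

def fits_in_duration(X0, V0, X1, V1, d, m):
    # True if traveling the cubic in time d never exceeds max accel m.
    a0, a1 = endpoint_accels(X0, V0, X1, V1, d)
    return a0 <= m and a1 <= m
```

The early out is then just `fits_in_duration(..., d = frame_dt, m)` - if that's true you can step the whole curve this frame.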

If we just consider A(0) for now we have to solve :

|A(0)| = m
m = max accel parameter


A(0)*A(0) = m^2
(B0 + B2 - 2B1)^2 = m^2 * d^4/36
( (X1-X0) - d*(2V0 + V1)/3 )^2 = (m^2/36) * d^4
You can expand out the dot product and you are left with a quartic polynomial in the duration "d". Solve this polynomial. There are in general 4 complex roots of a quartic. You must choose the lowest magnitude positive real root. There is always a positive real root : at d = 0 the left hand side is |X1-X0|^2 while the right hand side is 0, and for large d the d^4 term on the right dominates, so the two sides must cross at some positive d.
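A sketch of the A(0) solve (assuming numpy for the polynomial roots; `solve_duration_A0` is a made-up name). Expanding the dot product gives the quartic coefficients directly, and the A(1) case is identical with D = V0 + 2*V1 :

```python
import numpy as np

def solve_duration_A0(X0, V0, X1, V1, m):
    # |A(0)| = m expands to
    #   (m^2/36) d^4 - (|D|^2/9) d^2 + (2/3)(C.D) d - |C|^2 = 0
    # with C = X1 - X0 and D = 2*V0 + V1.
    C = np.asarray(X1, float) - np.asarray(X0, float)
    D = 2.0 * np.asarray(V0, float) + np.asarray(V1, float)
    coeffs = [m * m / 36.0,
              0.0,
              -np.dot(D, D) / 9.0,
              (2.0 / 3.0) * np.dot(C, D),
              -np.dot(C, C)]
    roots = np.roots(coeffs)
    # lowest magnitude positive real root
    candidates = [r.real for r in roots
                  if abs(r.imag) < 1e-8 and r.real > 1e-9]
    return min(candidates)
```

For the zero-velocity case this reduces to d^4 = 36*|C|^2/m^2, which is an easy sanity check on the coefficients.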

Note that the quartic arose because A was a vector - if we want to just solve the 1d problem it's much easier, because the equation in d is just a quadratic. Note that you can not just solve the 1d problem and run it independently on each dimension of a vector, because that allows maxaccel in each dimension separately and gives you curves that are obviously axis-aligned (it breaks the rotational invariance of your space).

Once you have solved for the shortest duration "d" which gives you a curve that doesn't exceed maxaccel, you can now simply step along the curve. Note that we are not simply applying a force, but taking an exact step along the curve, you evaluate the position and velocity from the curve at the end of your current time step. This is equivalent to doing an exact integration with a force that is changing.

NOTE : we are not solving this curve once and hanging on to it; our system has zero state. A new curve is solved each frame using the current position and velocity. Because the cubic is exactly constrained, if the external conditions are not changing, then solving each frame will produce exactly the same path as just following the original solution the whole way. If the external conditions are changing, then the path changes smoothly.
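The per-frame step is then just evaluating the fitted Bezier at s = dt/d. A sketch with scalars (works elementwise on vectors too); note the real-time velocity is dB/ds divided by d :

```python
def step_cubic(X0, V0, X1, V1, d, dt):
    # Fit the Bezier control points for duration d, then evaluate
    # position and real-time velocity at parametric time s = dt/d.
    s = min(dt / d, 1.0)
    B0 = X0
    B1 = X0 + V0 * (d / 3.0)
    B2 = X1 - V1 * (d / 3.0)
    B3 = X1
    u = 1.0 - s
    pos = B0*u*u*u + 3.0*B1*u*u*s + 3.0*B2*u*s*s + B3*s*s*s
    # dB/ds is a quadratic Bezier in the control point differences;
    # dividing by d converts parametric velocity to real-time velocity.
    vel = 3.0 * (u*u*(B1 - B0) + 2.0*u*s*(B2 - B1) + s*s*(B3 - B2)) / d
    return pos, vel
```

Because the system is stateless, next frame you re-fit from whatever (pos, vel) this returned.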

ADDENDUM : talked to Drew about this, he made me aware of two issues. One thing I should state clearly : when you are changing the duration of the cubic, you are not just changing how long you take to follow the same curve. Obviously changing the rate of change of the parameter would change the acceleration, but that would also change the start & end velocity. In the math we did here previously, we are scaling the duration of the curve while keeping the real-time start & end velocity constant, which means that we are changing the parametric velocity of the ends of the curve. That means the curve actually has a different shape as you scale the duration.

The other note is that it might be possible to have separate accel/decel parameters by using a different "max accel" parameter for A(0) and A(1), since you are generally at max accel at the start and decelerating at the end. That's not necessarily true if the start & end velocities are not zero, so maybe a bit more thought is needed, but in general the idea of using different maxaccel to get control over accel & decel seems possible.

BTW you can also achieve the same purpose using a piecewise quadratic. To use a piecewise quadratic you have 3 phases of travel :

phase 1 : accelerate up to max velocity
phase 2 : travel linearly at max velocity
phase 3 : decelerate to hit target
So in phase 1 you are fitting the end points {X0,V0} and ending at max velocity. The maths for the piecewise quadratic are slightly simpler, but the logic is quite a bit more complex because you have to figure out what phase to use and also handle the degenerate cases where phase 2 is not needed because you're close enough to the target.
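For the simplest case (1d, start and end at rest) the phase durations are easy to sketch; the function name and the triangular-profile fallback for when phase 2 vanishes are my own framing, not from testdd :

```python
def phase_durations(dist, vmax, amax):
    # dist > 0, zero start/end velocity : time to accelerate to
    # vmax, time to cruise, time to decelerate.
    t_ramp = vmax / amax
    d_ramp = 0.5 * amax * t_ramp * t_ramp
    if 2.0 * d_ramp >= dist:
        # Degenerate case : never reach vmax, the profile is
        # triangular - accelerate halfway, then decelerate.
        t_ramp = (dist / amax) ** 0.5
        return t_ramp, 0.0, t_ramp
    t_cruise = (dist - 2.0 * d_ramp) / vmax
    return t_ramp, t_cruise, t_ramp
```

Nonzero end velocities add more degenerate cases on top of this, which is the logic complexity mentioned above.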

I find the cubic more appealing because you can just do the math and get an answer. There is one advantage of the piecewise solution, which is that you can have separate max accel parameters for the acceleration and deceleration parts of the curve, which can give you non-time-symmetric curves.

03-29-08 - 1

While I'm at it I'll write up the PD I'm using for the record. This is not an exact solution of the PD - it's an implicit Euler step. You can solve the PD exactly, but the implicit Euler step is actually more stable than the exact solution; it adds a sort of fake drag, which is a nice thing in practice. It's also a lot nicer numerically for doing tiny steps than the exact solution with exponential functions.

K & D are unitless config parameters of the PD
  K = 1 and D = 1 are reasonable ; D = 1 is critical damping
time_scale = desired time to converge
dt = current time step
x0,v0 = start position & velocity
x1,v1 = target position & velocity

ks = 36 * (K / time_scale)^2
kd = 9 * K * D / time_scale

scale = ( 1  + kd * dt + ks * dt^2 )

a = ( ks * (x1-x0) + ( kd + ks * dt ) * (v1-v0) ) / scale

vt = v0 + a * dt
xt = x0 + vt * dt

If you had no min velocity that would be it. Note the funny scale numbers in ks & kd ; those are just there so that the time to converge roughly matches "time_scale", and so that K = D = 1 is a good choice. Note that K, D, and time_scale are really not three independent parameters, there are only two free parameters, but it's easier to tweak if they're separated because you can leave K and D alone and just change "time_scale" to match your desired duration to converge.
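Transcribed directly into code (a sketch; `pd_step` is my name for it, and the defaults assume the K = D = 1 tuning above) :

```python
def pd_step(x0, v0, x1, v1, time_scale, dt, K=1.0, D=1.0):
    # One implicit Euler step of the PD toward target {x1, v1}.
    ks = 36.0 * (K / time_scale) ** 2
    kd = 9.0 * K * D / time_scale
    scale = 1.0 + kd * dt + ks * dt * dt
    a = (ks * (x1 - x0) + (kd + ks * dt) * (v1 - v0)) / scale
    vt = v0 + a * dt
    xt = x0 + vt * dt
    return xt, vt
```

It's stateless just like the cubic : you call it every frame with the current (x, v).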

To handle min velocity you add some logic after the above PD step :

mv = min velocity

if ( |xt - x0| <= (mv * dt) )
{
	if ( |x1 - x0| <= (mv * dt) && |v1 - v0| <= 2*mv )
	{
		// close enough : snap to the target
		xt = x1;
		vt = v1;
	}
	else
	{
		// force at least min velocity towards the target
		vt = (x1 > x0) ? mv : -mv;
		xt = x0 + vt * dt;
	}
}
This is ugly in various ways. The first part was a pure PD that works in N-d. Now we've gone to 1d. Also the first part worked with arbitrary end velocities. This min velocity part only really works for a still target. I think it's not too hard to fix for a moving target by using a min velocity relative to the target but I haven't bothered.
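Here's the whole 1d step with the min velocity fixup folded in (a sketch of the still-target version described above; the name is made up) :

```python
def pd_step_minvel(x0, v0, x1, v1, time_scale, dt, mv, K=1.0, D=1.0):
    # Implicit Euler PD step, then clamp so we always make at least
    # min-velocity progress; snap once we're within one step of target.
    ks = 36.0 * (K / time_scale) ** 2
    kd = 9.0 * K * D / time_scale
    scale = 1.0 + kd * dt + ks * dt * dt
    a = (ks * (x1 - x0) + (kd + ks * dt) * (v1 - v0)) / scale
    vt = v0 + a * dt
    xt = x0 + vt * dt
    if abs(xt - x0) <= mv * dt:
        if abs(x1 - x0) <= mv * dt and abs(v1 - v0) <= 2.0 * mv:
            xt, vt = x1, v1                  # close enough : snap
        else:
            vt = mv if x1 > x0 else -mv      # force min progress
            xt = x0 + vt * dt
    return xt, vt
```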

Now of course PD's don't always take the same amount of time to converge, unlike the cubic. Convergence time depends on the initial separation and initial velocities. With the constants in the equations above, and if you set up your tweaks right, you can have a PD that converges in roughly "time_scale" for typical starting situations. Here's how the time varies :

PD convergence graph

the X axis is the "time_scale" parameter, and the Y axis is the actual time to converge. If it converged in the time you wanted, all the points would be on the red line (Y=X). There are four series of points for different starting distances so you can see how convergence time varies with starting distance. The distances are {160,320,480,640}. In the graph the start & end points are all at zero velocity.


03-28-08 - 3

I'm trying to make a "sexy movies" list because so many people out there are just fucking retarded. Rape and disgusting old men are never sexy. Just because a movie has sex in it does not make it sensual. Cheesy 80's movies are too goofy to be sexy. Super realistic sex or lighting is also not sexy.

The ideal sensual movie is intelligent, titillating, lush. It's a very fine line between cheesy and hot, because you do want a little bit of that orange-light soft-focus over-acting-pleasure, but if you get just an ounce too much it crosses into cheese and doesn't work. The ideal movies tease.

There are a lot of classic Hollywood movies from the studio age that sort of qualify, but I'm not going to include any of them because they never quite cross the line from alluring to arousing. Something like "To Have and Have Not" may in fact be more appealing to women, or pretty much any early Brando, Bogart or Cary Grant movie.

I'm also not going to list movies that have like 5 sensual minutes that are sort of out of place in the movie and don't fit in with the general mood.

Not Sexy :

  • Last Tango in Paris : this is on almost every list and it immediately shows the lister is retarded. Lots of sex but it's gross unromantic sex with a nasty old man. Pretty horrible pointless movie all around.

  • The Lover : again this is on just about on every list and is just awful. Lots of sex and the people are at least physically attractive, but the acting is just awful, the characters are not at all mentally attractive, and it's still in the disgusting rich old man with really young girl territory that so many of these shitty perverted movies seem to dwell in.

  • Dead Ringers : WTF again, on many people's lists. Gynecological torture is not sexy! (decent movie though, just not arousing)

  • Irreversible, Romance, The Piano, Bad Lieutenant, Lilya 4 Ever, etc. : my god you people are sick dumb fucks. Rape/violence/degradation is not sexy.

  • Kathleen Turner, Michael Douglas, Mickey Rourke, 9.5 weeks, Basic Instinct, etc. : all this stuff from the 80's is just not watchable any more. I'm not sure it ever was actually decent, the people in these movies tend to be really gross hollywood types. Also Kathleen Turner looks like a football player and sounds like Tom Waits. Not sexy.

  • Secretary : again this is on a lot of lists and I just find very little to like in it. The idea of it sounds good, I like Maggie Gyllenhaal, but it's just more goofy than hot, and even the goofiness doesn't really make it interesting, and it just winds up being really boring and pointless.

  • Center of the World : filthy pointless and completely unappealing.
Borderline :
  • In the Realm of the Senses : very mannered Japanese movie with explicit sex. I just found it really boring and not even the sex held my attention. Too much sex all the time and not enough plot and character development to make us want the characters.

  • The Pillow Book : a typically slow and bizarre Greenaway movie. The elaborate bizarre construction of the movie prevents us from ever really feeling too much of anything for the characters and keeps the movie rather cerebral.

  • The Cook The Thief His Wife & Their Lover : by far the most watchable and dynamic and narrative of Greenaway's movies, this is a brilliant movie, one of my favorites, it's surreal and energetic and beautiful and sexy. I do consider it a bit too strange for this list, and it doesn't quite play into the arousal/sensual mode, but if you want to put this on your list I won't complain.

  • Wild Things : kind of a decent fun little thriller/soft-core movie ; I don't find the actors/actresses very attractive and there's only like 5 minutes of sexy part in the swimming pool, but it's not an awful movie so it gets an honorable mention.

  • Blue Velvet : this shows up on a lot of "sexy movie" lists. It's an absolute masterpiece of a movie, but I don't really think it belongs in the "sexy" category. Lynch's later soft core lesbian Mulholland Drive is sort of borderline too, but it's just mediocre and too weird to really be hot.

  • Eyes Wide Shut : I don't actually find the characters or what happens with them to be particularly arousing, this one gets an honorable entry simply for Kubrick's direction. The colors are so rich, the lighting so beautiful, that the film is really sexy.

  • The Dreamers : mildly hot, but just way too retarded to be arousing. Horrible acting, script, self indulgent, pointless.

  • Y Tu Mama Tambien : this is a pretty decent movie, I enjoyed it, but I didn't really find it to be arousing at all. I consider it more a "coming of age" movie.

  • Red (Kieslowski) : the relationship of the main characters is not sexual, and unlike many retarded reviewers I don't think red = passion (it's fraternity , from liberte-egalite-fraternite). This movie is sexy just for the lighting, the constant rich color red, the camera work, the slow pace, the good acting, and the beauty of Irene Jacob, along with the mild titillation of voyeurism. BTW "La Double vie de Veronique" is more overtly sexual and also good.

  • Betty Blue :

  • Bound : meh, this movie is just trying way too hard. Tilly is a horrifically bad actress and Gershon only manages to be moderately sexy.

  • Romeo is Bleeding : such a bizarre awesomely broken movie, it's one of my favorites; not really sexy in general, but Lena Olin as the over the top sexual/twisted femme fatale has a few scenes that melt my pants. Not for couples.

The truly sensual movies are as much about the director and the look and feel of the movie as what happens to the characters. They also have to be good enough to be watchable just as movies.

Sexy (in no particular order) :

  • The Swimming Pool : perfect execution of the sexy neurotic thriller. Soaked in the heat and slowness of the sun, it gradually builds up sexual tension and dramatic tension. Smart enough to stay interesting.

  • Belle de Jour : a classic that's on every list. I don't really think it's an extremely erotic movie, since it is rather sad, and prostitution and unfaithfulness are never really hot, but Catherine Deneuve is so sad and sweet and slow and sophisticated that it still works.

  • Malena : we had to have some boochy-loochy movie on here, and this is the obvious choice. It's definitely cheesy and shallow, but Monica is gorgeous, it teases us nicely and builds arousal, it's well shot and not unwatchable.

  • La Belle Noiseuse : this is a very weird movie. The film is very bright and open and full of empty spaces. It's very repetitive and builds up a theme through variations kind of like a Bach composition. It's definitely not a sexy movie in the normal sense, but it does feature a lot of gorgeous naked Emanuelle Beart. The first hour I thought I was going to be totally bored despite all the nudity, but if you stick with it you actually get into the characters and it becomes quite a rich movie.

  • Sex and Lucia : gorgeous lush sexy movie. I think if you try to follow it too closely you might find it retarded and hate it, but if you just treat it like a dream and dive in and let each scene wash over you, it's irresistible.

  • Tie Me Up! Tie me Down! : pretty much every Almodovar movie is extremely sexy, just for the colors and the energy and the beautiful people. I picked this one kind of randomly; maybe "Live Flesh" would be a better choice.

  • In the Mood for Love : like many on the list, this is basically here for the director Kar Wai Wong; I could've gone with something from the Fallen Angels series, but I think this is most purely sensual of his movies. The camerawork is so gorgeous, the period setting, the costumes, everything is perfect.

  • Like Water for Chocolate : pretty cheesy, but gorgeous, and for me surrealism and fairy tales and cooking all enhance the sensuality.

  • The Hunger : this sort of breaks my rule about cheesy 80's movies. You have to kind of enjoy cheesy old horror movies for this to be watchable, but if you do you will be rewarded with the incredible Deneuve-Sarandon lesbian vampire scene. The whole movie is slow and weird and dreamy enough to put you in the right mood.

  • Damage : making a straightforward movie about sexual attraction is very difficult; lust is hard to capture without being phoney or just boring. Damage is probably the best movie ever made that's a straightforward narrative depiction of lust.

  • The Last Seduction : Fiorentino is the best modern femme fatale. Perhaps too aggressive for some female viewers.

03-28-08 - 2

The romantic relationships you have when you're older can really never match the magic you can share with someone earlier in life (up until around 25 or so). In your first few serious love affairs there are so many new experiences you can have together - really falling deeply in love for the first time, living with someone for the first time, really wanting to spend the rest of your life with someone else for the first time. And even beyond those obvious ones there are just constant new things that you can teach each other and show each other in the world, because you're both young and you're still discovering things and becoming adults. When you're 22, you can do things like just go to a fancy restaurant and it's so exciting and new, or take your first vacation that's just you and a lover flying somewhere exotic, or discover new kinds of music.

03-28-08 - 1

God damn tattoos are so fucking awful. Why are you putting this nasty dark green badly drawn fucking amateur art on your gorgeous human body. If you're a big ugly man, then WTF go for it, knock yourself out, but if you're a sexy woman then don't wreck yourself!


03-27-08 - 1

How to make casual group decisions.

Typical office conversation around noon : Where do you want to go to lunch? Uh, I dunno, whatever, where do you wanna go? Whatever. Well what about that one place? Mmm, nah, not there. Well how about blank? Nah.

I like to play "Choice Giver / Selector". This is sort of like the fair way to cut cake. One person cuts the cake into pieces, then the other person gets to choose which piece they want.

In "Choice Giver / Selector", first someone is nominated or volunteers to be the Choice Giver. The Choice Giver should give 2-3 fair options which are reasonably distinct, eg. you can't just give the one option you want plus a couple of straw man options. There's always protection against a bad choice giver, because the Selector gets to choose one of the options, or he can elect to swap roles - he says "none of the above" and then becomes the Choice Giver and must give options himself.

In terms of the social dynamic, two details of this are important. The Choice Giver is only giving a few choices - not simply listing every possible option. By only giving 2-3 options he is expressing his own preferences, not putting all the burden of decision making on the Selector. The Selector also can't just simply say "no" - if he doesn't like the options then he must take on the burden of giving options, and he must then offer 2-3 options that are reasonable and distinct.

I've given the example for two people, but this works in a group too. The initial Choice Giver volunteers or is nominated by the group. The group then votes on the options, or votes "none of the above". If none of the above wins the vote, then the group votes to nominate a new Choice Giver.

Of course it's quite common for people to play an informal version of Choice Giver / Selector but they leave out some of the very crucial aspects of the rules which ruins the entire balance of the game.


03-26-08 - 2

The positive effects of me writing personal things in this space are very very small. The negative effects have been quite large a few times in my life. It's pretty obviously not worth it. I should really delete all the personal stuff and never write anything personal again.

It used to make both Dan and Tiffiny very upset, not because I was writing anything personal about them or our relationship, but because they felt like I was sharing more of my inner thoughts with the anonymous internet than I was with them.

The positive effect is mainly that it leads me to email threads with people I'm close to on a much more intimate level than we would otherwise communicate. I've gotten some very good emails recently because of that, but in theory I could have those conversations with my friends without using a public blog to initiate them. In practice I probably can't.


03-26-08 - 1

Well I promised that I would eventually take my controller stuff out of testdd and put it in a .h so it's actually useful to people. Yeah, I'm probably not gonna do that. What I did do is fix my quartic curve solver to actually be correct, and added a "hybrid" cubic version. (hmm, I'm a retard, I just added a ComplexDouble class to cblib and then noticed that std::complex is totally fine and I should've just used that. doh)

Hybrid cubic works like this : if the target is stationary, the max time to converge is set by a user parameter. A shorter time can be used at any time as long as max accel (user parameter) is not exceeded. The result is that you get the desired behavior of the original cubic controller of reaching the target exactly within a certain time interval, and it avoids the weird quirk of sometimes taking extra long curves because it was always taking the same amount of time - now it can take a shorter time in a reasonable non-hacky way. Note that this is not some tweaked out fudgy solution. There are two user parameters : max time to converge & max acceleration on the path, and the cubic path that fits those constraints is found exactly each frame.

I wrote something about this on game-tech but I never posted it here I don't think. To me, PD controllers and this cubic controller thingy seem like very opposite ends of a theoretical spectrum of controllers. On the one end, the cubic curve is exactly symmetric. That is, the path from P0->P1 and the path from P1->P0 are identical under time reversal. The cubic goes from point A to point B in controllable finite time, and basically never overshoots. The PD is sort of the opposite in every way - it's very asymmetric, it attacks hard to cover distance fast initially, then slows down and comes in to settle very slowly (eventually infinitely slowly for a pure PD). A standard critically damped PD will definitely overshoot as part of the price to attacking quick and then stopping gently. In hand-wavey terms the PD feels very "springy" while the cubic feels very "robotic".

Now in theory it would be awesome to have this as a parameter. So if parameter is 0, you get a cubic, if parameter is 1, you get a PD, and in between you can play with how robotic or how springy your controller is. And of course that's easy. You can just blend the controllers. It's important to note why you can blend the controllers - they are stateless and linear. You could just get the acceleration from each and blend that, but that's not exact since with both the PD and the cubic I've solved them to take exact time steps, going to a discrete integrator with a constant acceleration would be lame. But of course Newton's equations are linear too, so I don't need to blend the accelerations, I can just run both controllers independently and blend the resulting position & velocities. This is now in testdd as "mode 10" ("blend").
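Since both controllers are stateless maps from (state, dt) to (position, velocity), the blend really is just a lerp of their outputs. A sketch (the controller callables here are placeholders standing in for the cubic and PD steps) :

```python
def blend_step(step_a, step_b, t, dt):
    # step_a / step_b : stateless controllers that take a time step
    # and return (position, velocity). t = 0 is pure A, t = 1 pure B.
    xa, va = step_a(dt)
    xb, vb = step_b(dt)
    return ((1.0 - t) * xa + t * xb,
            (1.0 - t) * va + t * vb)
```

This relies on exactly the linearity argument above : each controller takes an exact step, and Newton's equations are linear, so blending the resulting states is as good as blending forces.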

A quick note on PD's. You generally want a critically damped PD. PD's will never exactly converge, so you need to fudge it. The best fudge I found was a "min velocity" parameter. This makes the controller try to move you at a speed of at least "min velocity" towards the target, which cuts off the long exponential tail of PD convergence and makes the end part linear. Now, with this you might go ahead and try using an overdamped PD to reduce overshooting. That does work but it's a little weird looking. I can't imagine why anyone would ever want an underdamped PD, they're just gross. I couldn't find a clean stateless way to give a PD controller a max acceleration. If you do it naively, you prevent it from stopping fast enough and get horrible oscillation. If you try to prevent that naively by always allowing deceleration, you get weird looking curves when your points move around in 3D space, because you've created a splitting plane where your controller changes discontinuously.


03-25-08 - 6

GOD DAMN MY 5 AM WAKING UP NEIGHBOR! Fuck me, I have to move, this is fucking awful. I think she's a lesbian, but I have very little evidence so far, so that's just wild speculation at this point. Also it seems she's rich; she uses a car service all the time. Car service is one of those things that doesn't actually cost much more than a taxi, but only rich people do it.

03-25-08 - 5

I thought of a good sound-byte for my code design philosophy : "Compilation equals correctness".

Of course correctness does not necessarily mean you are bug free, but the idea is that any "policies" which are required by your classes or functions should be enforced via the compiler as much as is reasonably possible. Incorrect coding should show up as a compiler error, not as simply something that you have to know is incorrect because you read the comment and know that "hey, that's the wrong way to use that thing".

03-25-08 - 4

Thoughts on intrusive vs. non-intrusive ref counting : (intrusive = refcounted base class, non-intrusive = boost::shared_ptr or equivalent)

We'll do pros & cons in a second, but perhaps the biggest difference is philosophical. Intrusive ref counting forces you to use ref counting & smart pointers to manage those objects. Non-intrusive lets you use them or not, and lets you use a mix of different lifetime management schemes. Some people consider this an advantage for non-intrusive, but personally I consider it a big advantage for intrusive. I consider it to be much better to have a single global clear rule for lifetime management so that you can go into any part of the codebase and things are being done the same way and you are familiar with the gotchas and how things work. Also if I design some classes and managers, I consider it a big advantage that I can force my clients to manage the objects the way I intended and not do whatever they please with them. In general in coding I'd rather have fewer options as long as they are reasonable options, and be able to enforce usage that I know is correct, as opposed to allowing a variety of uses and leaving it up to the client code to be correct.

Non-Intrusive Pros/Cons :

Pro : Can easily refcount classes from outside your codebase, like FILE or Windows/OS objects, or whatever. Doing the same with an Intrusive system would require a wrapper, though it's a pretty trivial template wrapper, so this isn't really a huge difference.

Con : cannot (easily) get from the object to its refcount or smart pointer. That means you can never get from a naked pointer to the controlled pointer. That means any function which could affect lifetime, or could call anything that affects lifetime, must take a smart pointer as an argument. That in turn means you pretty much have to use smart pointers everywhere, which means the celebrated flexibility of being able to use different types of lifetime management is a bit of a phantom. Also means that all those functions that take smart pointers can't be called from object members because "this" is not a smart pointer.

Pro : becoming pretty common through boost so there's a lot of familiarity and code out there that use this stuff.

Con : speed and memory use overhead for the extra shared refcount. Obviously it uses some kind of fast pooled allocator, but it's still off somewhere else in memory space, not right with the object, which is ugly.

Con : because the object does not own its ref count, it can't verify that it's being counted correctly or managed correctly; in general no way to enforce proper lifetime management because you don't know how your object is being managed, it's left to the client.

Intrusive Pros/Cons :

Pro : Can easily go from naked pointers to smart pointers, which means you can just pass naked pointers in functions, and also use "this". If you choose, the speed hit or threading performance hit can be very very low, because you only need to actually use smart pointers when you are doing something that affects lifetime - otherwise you can just use naked pointers.

Con : a bit ugly with multiple inheritance; either need to use virtual inheritance to refcounted base, or use macros to make concrete classes refcountable. Non-intrusive doesn't have this problem, you can just refcount whatever multiple inherited thing you want.

more ... ??? @@

03-25-08 - 3

I walked around the park today. The cherry blossoms are amazing around the park right now. Obviously there are great specimens in the Japanese Tea Gardens, but that place is mobbed with tourists, even on a gray week day, I can only imagine the horror on a sunny weekend. If you just walk around the park randomly near there there are lots more good ones, and the Botanical Garden right there also has tons of cherry blossoms and lots of other random flowering stuff.

Also saw them setting up the new Chihuly exhibit at the de Young. It doesn't open until June or something yet, and there will be a bunch of it at the Legion of Honor as well. There were tons of boxes marked Chihuly piled up so it looks like it should be pretty rad.

I love skipping.

03-25-08 - 2

The Tenderloin is the scariest, but also perhaps the most fun neighborhood in SF.

My suggested itinerary :

check out art at Shooting Gallery / White Walls
eat at Bodega Bistro, Pagolac, Thai House Express or Lahore Karachi
get drinks at Bourbon & Branch , or Rye, or Koko
go clubbing at 222 or see a band at the GAMH

03-25-08 - 1

If you want to Yelp and have it actually be useful for you, here's what you do :

1. Make an account.

2. Click "Member Search" in the upper right, and enter "Toro E". Click on his name to visit his profile.

3. In the left side bar, click "Add To Favorites".

4. Now search for places. The review from Toro Eater should be on top. Read it. Ignore the average rating and all other reviews. If there's no review from Toro Eater it probably sucks.

5. Yes, Toro Eater is pretty much only in San Francisco. If you don't live in San Francisco, every place probably sucks. Live somewhere better.


03-24-08 - 2

I did my Google interview today. It was pretty rough, it's just a long time to be talking to strangers. I want to write about it more but I'm scared of the Google NDA death squad coming after me. I got woken up repeatedly in the middle of the night last night by my neighbors, so I woke up feeling awful and was really in bad shape in the morning. I think I did pretty poorly on the first two interviews, but then did pretty well on the last two.

Google basically picks four engineers semi-randomly to grill you. That's good in that you get to meet a few different people, and they have different questions and areas of expertise, so maybe as a whole they can get a better picture of you. That's bad in that they aren't necessarily people who are particularly good at interviewing. Doing a good tech interview is a difficult skill that takes a lot of thought and practice. Some of the guys were pretty good interviewers; some were not.

I got a few really good questions that I could tell were well thought out and made me think and work out how to do things and demonstrate competence; I'd like to post them because they were cool, but I'm afraid.

I also got two of the classic bad tech interview types of questions :

A) the brain teaser : present the candidate with a clever question that's really quite easy if they see the answer, but is just a stumper that you can't really work out unless you see the trick. These questions are horrifically bad interview questions, they don't test intelligence (except a certain kind of riddle-solving intelligence) or problem solving skills.

B) the "one true way" question : in this you ask a rather open question with various decent answers, but in your head you have an idea of the "one true way" that you believe is superior, and you try to prod and lead the candidate to come to your solution. These questions only succeed in telling whether the candidate has the same coding background and priorities as the questioner. A better way to test the same point is to just go ahead and tell them the way you're thinking of (after they give their own solution) then ask for the pros & cons or why/when you might use one or the other of the different ways to see if they really understand it.

I also have a general mental problem in those kind of situations where I tend to not give the really simple obvious answer. For simple quiz questions that should have a really simple answer, I tend to overcomplicate them. Like "how would you store 3 names?" , I'm like "well, vector < string > " , when I should just go "char * names[3]". I also spaced on what a semaphore is exactly, so :(

03-24-08 - 1

I just realized I have to tell girls I'm dating about my web page like it's an STD. I ask them to sit down. "There's something I have to tell you. You're going to find this out eventually, so I thought you should hear it from me. Umm ... I have a blog."


03-22-08 - 6

I've never presented my food well. In fact I almost intentionally present it badly, just piles of slop on an earthen plate. I feel like the food should be judged by its flavor, not the way it looks, and I'm bothered with all the amateur photographers who make mediocre food but make it look gorgeous and everyone drools over it. Of course my attitude is just retarded. I've taken a similar attitude with my personal presentation as well, my fashion & coiffure and so on. I always thought I should be judged by my actions, and not all those other trappings that people care so much about, and I rebelled against them by intentionally being frumpy. Of course that's totally retarded. People do care how the food is presented, even the smart people who mainly care about flavor, even the people that I want to impress, of course they are still affected by presentation, both subconsciously and consciously, and to intentionally flub that aspect just makes you rate lower than you deserve.

03-22-08 - 5

All these rescue chef shows are retarded because the stars are just bad teachers. They use this teaching method where they just say "don't do what you were doing, let me show you a whole new way". The lesson is not related at all to the student's experience, it's just a presentation of the "right way". That's a horrible way of teaching, but it's very common. People learn much faster if you work with the way they've already figured things out. The student is already trying to do it a certain way and has reasoned out thoughts on the subject. If you can look at what they're doing and go "okay, I see what you were going for, but here's what you did wrong.." or, "okay, I get your way of thinking about the problem, but actually you're missing this key point.." people learn much much faster if you do that.

03-22-08 - 4

I've been doing a lot of heavy deep squatting and high box jumping recently, and my knees are happier than they've been in years. The vast majority of people make the mistake of "taking it easy" on their damaged body parts. They have iffy knees or whatever and their response is to not bend them much, never do anything too athletic with them, just not push them. That provides a temporary decrease in discomfort, but it actually makes the joint even less healthy (because it tightens up, the muscles atrophy, blood flow decreases, etc.). What you want to do is to continue to use the joint as heavily as possible, but without aggravating it. You achieve that through controlled stresses.

Addendum : on the other hand I'm a retard and just moved my bed around by myself. That's a good example of a non-controlled stress which can suddenly put excessive forces on body parts that aren't well suited to it.

03-22-08 - 3

"No Country for Old Men" was really masterfully executed, and Javier Bardem is so creepy in it, but at the same time, it's just elegant emptiness. It says nothing, and it's just a standard ( Texas Sheriff / take the drug money and run / serial killer ) movie. I guess the derivative boringness of it is sort of intentional, the Coen Bros have always loved classic Hollywood movie formulas and many of their movies are just homages to standardized forms.

03-22-08 - 2

It's a good day in my world. My cold is gone. My left shoulder is almost better. The sun is out. I've been hiking and biking. I did some pretty sweet tricking, such as vaults and kick-jumps. I've been practicing the bike balance a lot, slowly getting better, working up to 5 second balances, but today it suddenly clicked for me and I did a 30 second balance. I almost don't mind red lights and stop signs because I can practice (and show off) my sweet bike stand.

03-22-08 - 1

Jordan Chavez wrote me to point out the Weak Pointer implementation I have in cblib is pretty dumb. Yeah, he's right. Let me explain a bit. I wrote an article on smart & weak pointers long ago. I think they're rad. I'm going to assume you have intrusive refcounting because it's just so superior to nonintrusive.

Now so far as I can tell there are two major styles of weak pointer. One is where the WeakPtr actually holds a pointer straight to the object and gets cleaned up intrusively when the object dies. The other is where the WeakPtr has some kind of indirect handle to the object which becomes an invalid handle when the object goes away.

The first way where the WeakPtr actually points at the object has one big advantage : pointer accesses are very fast, in fact they're free. There's no need to look up a handle, if you have a pointer you just use it. Of course it is slower to destruct an object, because you have to clean up your weak pointers. There are various ways to implement this, but they pretty much all have pretty high memory use overhead. My Galaxy3 has a WeakPtr of this style, implemented using a doubly linked list to clean up the weaks.

The second way with a handle can be very compact in memory, but has the disadvantage of dirtying cache for any weak lookup because you have to go through an extra indirection. The implementation in cblib is based on what I did for Oddworld, and it's intended to be as small as possible in a fixed-memory console environment. On the console, the WeakPtr is 4 bytes per WeakPtr + 6 bytes per object, but you are limited to 64k objects max. The version in cblib is just a lazy port of that to 32bit indexes so it's got 8 byte WeakPtrs which is a waste. Note because of the GUID system in the cblib version, weak table entries are reused right away, so you only need as many weak table entries as you have objects, so I can statically allocate a 64k-element weak reference table.

Jordan's suggestion is basically like this :

	// the WeakPtr itself :
	int index; // index into the weak table

	// a weak table entry :
	RefCounted * ptr;
	short weakCount;
To use the WeakPtr you just look it up in the table with [index] and use the [ptr] there. When an object dies, you set its [ptr] to NULL and leave the entry in the weak table. The [weakCount] counts how many weak pointers point at that slot so you don't reuse the slot until the weak pointers go away.

WeakPtrs can also update themselves for free; if you use the [index] to look up the table and find a [ptr] that's NULL, you just null out your own [index] and decrement [weakCount]. This is a lazy way of eventually releasing all the weak references to a slot.

In theory there are pathological things you could do to have a huge amount of unnecessary weak table entries, but that's very unlikely in practice. This method is 4 bytes for a weak pointer and 6 bytes per weak table entry - but you can't just statically allocate a 64k entry weak table for a max object count of 64k because there will be empties in your weak table.
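Here's a minimal single-threaded sketch of that scheme (names and the growable std::vector table are mine, not Jordan's or cblib's; the real thing would use a fixed table and a free list):

```cpp
#include <cstddef>
#include <vector>

// hypothetical stand-in for the intrusive refcounted base :
struct RefCounted { int refCount; };

// the idea behind the 6-bytes-per-entry weak table :
struct WeakTableEntry
{
    RefCounted * ptr;   // NULL once the object dies
    short weakCount;    // how many WeakPtrs still aim at this slot
};

static std::vector<WeakTableEntry> s_weakTable;

// called at object creation ; reuses fully-released slots :
inline int WeakTable_Add(RefCounted * obj)
{
    for (int i = 0; i < (int)s_weakTable.size(); i++)
    {
        if (s_weakTable[i].ptr == NULL && s_weakTable[i].weakCount == 0)
        {
            s_weakTable[i].ptr = obj;
            return i;
        }
    }
    WeakTableEntry e; e.ptr = obj; e.weakCount = 0;
    s_weakTable.push_back(e);
    return (int)s_weakTable.size() - 1;
}

// called at object destruction ; the entry lingers until weakCount drains :
inline void WeakTable_Remove(int index) { s_weakTable[index].ptr = NULL; }

struct WeakPtr
{
    int index; // -1 = null weak pointer

    RefCounted * Get()
    {
        if (index < 0) return NULL;
        WeakTableEntry & e = s_weakTable[index];
        if (e.ptr == NULL)
        {
            // lazy self-nulling : release our claim on the slot
            e.weakCount--;
            index = -1;
            return NULL;
        }
        return e.ptr;
    }
};

inline WeakPtr MakeWeak(int index)
{
    s_weakTable[index].weakCount++;
    WeakPtr w; w.index = index;
    return w;
}
```

Note the lazy release in Get() : dead slots are reclaimed as a side effect of failed lookups, which is why you never need an explicit "clean up my weak pointers" pass.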

Smart Pointers are falling a bit out of favor (not that they ever were IN favor) because of multi-threading. Multi-threading is ubiquitous in games these days so we have to handle it. Passing refcounted objects around between threads is totally fine - it just means you need to thread-protect all your ref access. This also means thread protecting the smart pointer constructor & destructor that make the ref count go up and down, as well as the weak pointer's accesses to the weak table. This can be a bit of a performance hit. In fact the weak table is especially ugly because it's memory that all your processors would have to share. The linked-list style weak pointer is actually better for multiprocessor memory access, assuming that object lookup is far more common than object deletion.

Now, this doesn't have to be a big performance problem, because I've always been a fan of the "ref holder" style of smart pointer use, rather than the full Herb Sutter exception-safe "everything takes smart pointers" style. The ref holder style looks like :

void MyFunc()
{
	SmartPtr sp = world.GetSomeObject();
	thingy * ptr = sp.Get();
	... do stuff on "ptr" ...
}
basically we're just holding sp in the function scope to hold a ref, but all our work is on a plain old naked pointer "ptr". The sp would cause one interlock to take a ref at the top and release one at the bottom, no biggy unless your function is tiny.

There are surely other threading issues that maybe people with more experience can chime in on.

For example, to be thread safe, there should not be any "Get()" function on the WeakPtr to return a naked pointer, you can only return a smart pointer and have an IsNull check, to avoid the cases where your WeakPtr returns a pointer but the object gets deleted by another thread before you use it.


03-21-08 - 2

Fucking hell. My new neighbor that just moved in has some job where she wakes up at 5 AM and gets ready then goes out slamming the front door, starts up her car and drives off around 6 AM. I know all the details because I sit there in misery listening to each sound and watching the clock tick by. Fucking people should not be allowed to live in my building if they keep fucked hours like that.

In other "this fucking building sucks and I know more about my neighbors than I want" , I talked to the boyfriend of Slutty Girl. He seems like a really sweet intelligent guy and it's made me feel even worse about the situation. I also found out more about the story. They've been dating a while. They met up in the wine country when they were both living there. He owns a house up there somehow and they lived together there. They just got bored and decided to get an apartment in the city here, but he still owns the house and goes up there a few days each week, which is why he's often not here. She got a job here and stays in the city all week. Now it all makes sense; almost every day when he goes back out to the country, she has some random guy over. It's so disturbing. I guess part of what disturbs me so much is that the boyfriend is a lot like me, sort of naive and trusting and easily run over, and Slutty Girl would seem totally normal if I didn't know about her secret life. Any of my past or future girlfriends could be just like her, and I'd never know.

03-21-08 - 1

I found a nice general page about shoulder health . It's a good read for any athlete, since prevention is so much better than recovery.


03-20-08 - 5

Dan never screwed the cap back on the toothpaste after using it. I never said anything; I mean I specifically chose not to complain about it; I feel like I tend to be that jerk who has to have everything his way and is constantly pointing it out when you do something in a way that I consider wrong, and it just gets really annoying to live with me because I'm always nagging about something. So I try to stop myself from being that guy. Some things that matter (like don't use my good chef's knife on any surface but wood) I will nag about but other things that I decide I can let slide (like the cap on the toothpaste) I try to just not say anything about. But the truth is it always bugged me. Every time I walked into the bathroom and saw the cap off the toothpaste it felt like a finger poking me in the ribs. I would often screw the cap on myself. And now that Dan is gone, it's one of the odd little things that I'm happy about. Ahh, every time I walk into the bathroom the cap is on the toothpaste. So relaxing. BTW I've learned absolutely nothing from this anecdote.

03-20-08 - 4

I just found a ruined Calphalon pan on the street and refinished it. It takes about an hour to resurface a pan and quite a bit of muscle work. How to restore the surface of a metal pan :

Start with a metal scouring pad (brass is good, an SOS pad is okay too). Scrub HARD mainly in a back-and-forth manner. The goal is not to get the dirt off, it's to actually remove the top layer of metal from the pan's surface, so you have to scrub hard hard hard and a long time - 5 minutes or so. Make sure you get in the corners too. There should be a large amount of fine metal powder building up in the pan, wipe it out with a towel and move on to the next step :

Metal sanding paper. You want at least 2 and preferably 3 different grits of sand paper, you want the kind made specifically for sanding metal. Start with the coarsest grit and again sand HARD all over. This is where you attack any bumps or uneven spots and try to get a smooth surface. Start with back-and-forth but finish with round-in-circles sanding. Wipe out metal filings with a towel and progress to the next smoother grit. Sand some more. As you get to the finest grit you want to start worrying about your sanding pattern. You don't want to leave big grooves in straight lines from sanding as that will make the food stick to the pan. Your goal is a mirror-like smooth finish. Try to use semi-randomized circular motions so that you are smoothing everywhere and not digging big grooves with hard back & forth sanding. When you think you've done enough, now do that same amount again. Wipe out the very fine metal dust with a towel.

Your pan should now be smooth, but don't try to use it yet. It's still full of fine metal dust. First wash with hot soapy water, and towel dry. Then pour in a little oil and rub it around the pan, then wipe out all the oil with a paper towel. If your pan is cast iron, you need to cure it. It doesn't hurt to cure other pans either. Put in a bunch of shortening (crisco) in the pan to coat, stick it in a 350 degree oven for 30 minutes (or 300 for 2 hours, meh). Wipe out the grease after it cools.

Finally we want to get the last bit of metal shavings out. We do this by cooking something to coat and throwing it out. Making a batch of scrambled eggs works fine for this, just cook them up and throw them out. Alternatively I think making a roux would work too and be cheaper. Your pan is now good as new.

03-20-08 - 3

Despite all the internet information sites, there's still no substitute for massive amounts of learned local information and the human memory. Part of the problem is that when you're trying to think of a good place to go, the criteria in your head are not so simple. You don't just think "I need a decent restaurant in Hayes Valley". You think "I need a decent restaurant somewhere near Hayes Valley (though I would go farther if necessary), it should be mid-ranged priced, classy but not stuffy or a big scene, I need to be able to get in without a reservation, it should be decently small and cozy, not a big cavernous noisy mess, somewhere I can wear jeans and a t-shirt and not feel under-dressed". There's no way to really search like that on the web and weight all the different factors. In your head you can store all this local knowledge of good places, and you can do a really nice weighting, so maybe you pick a place that doesn't even fit a single one of the criteria exactly, but it's pretty close on all of them.

After about 1.5 years in SF I'm still not even close to having the necessary local knowledge.

03-20-08 - 2

MapJack reminds me of those old dungeon crawl games like Wizardry or Dungeon Master. It would be pretty fucking rad to mod mapjack to give you a character and have random monsters pop out as you walk around the dots.

It's also rad you can link to specific views like : this cool stairway near my house

03-20-08 - 1

I haven't actually watched a movie in months. I put the movie in the player, watch about 15 minutes, get antsy and bored, boot up the computer and start writing and surfing the web while the movie plays. It really doesn't give the movie a fair viewing, but FUCK sitting and watching something is so fucking boring. Sometimes I'll do my shoulder physical therapy while watching, and that works okay. The only way I can actually sit and watch a movie is if I have a girl leaning against me to give my hands something fun to do. Yeah I guess I have ADD really bad.


03-19-08 - 2

I've been thinking a lot about Danielle and Tiffiny lately. I try to keep them out of my head to move on, but that's not really possible. I'm trying to make myself more emotionally open and honest, and that's not something you can do selectively, when you open the gates everything comes through. I just learned this word saudade which is pretty much what I feel for them. It's something I'm okay with. I don't think about them constantly and it's not holding me back in any way, it's just a slight sadness that's always present in my mind which I greet and acknowledge and choose not to pay too much attention to.

I wish I had more pictures from my past. I've never been good at taking pictures, it just seems like a pointless nuisance at the time, but after the fact I wish I had them. At the time I think "I'll never forget this, it will be in my memory forever" but in fact the memory fades and becomes only an outline - or a memory of when I could remember it.

03-19-08 - 1

Went to Briones park today; put some pictures up in my Flickr. It was pretty wonderful. Read real descriptions here : Gambolin & Kevin . I got kind of lost and accidentally did a 12 mile loop instead of 7 miles. Oops. Also I did a clean box jump onto the back of a bench and stuck the balance which was pretty sweet. There's lots of big meadows and hills there that you can just wander around off the trails. I wouldn't go there any time but spring, though.

I was trying to think of what hikes or natural things I really need to do while it's still lovely spring time. Down in San Luis Obispo the spring is really the only amazing time of year and the rest of the time it's kind of shitty. Up here that's not so true, the redwood parks are better in summer, there's not really a big green flush and wildflower bloom up here. Maybe I'll try to get down to Sunol Wilderness or Henry Coe, but they're so far from the city. Actually San Bruno Mountain right here is really nice in spring, I should try to do that before it gets dry.

I'd never been out the 24 there before. The little pocket around Orinda to Lafayette is really gorgeous, rolling hills and lots of greenery. I'm sure it's full of fucking disgusting tech company yuppie types with their knobby-tired self-balancing strollers and biodegradable SUV's. But BART goes right through there which would make it a pretty sweet place to live when I'm old and married.


03-18-08 - 1

On Unit Interval Remappers

Some people have looked at the "controller/lerper" problem and mentioned interpolators. Interpolators are great things but they are not useful as controllers, because the target is not necessarily constant, and even the "current" object is not under 100% control of the controller. The target can be jumping around, and hell the object may have other forces being applied to it as well. If you try to use an interpolator and just keep resetting it, you will have discontinuities and very ugly responses.

Anyway, it reminded me of all the 0->1 : 0->1 float functions. Everybody knows so-called "smooth step" which IMO is more correctly called a hermite lerp because I'm sloppy and say "lerp" for any interpolation :

hermitelerp(t) = (3 - 2*t)*t*t
So, like you have your t in [0,1] which is your lerp parameter, and you want to get between two values A and B, instead of A + (B-A)*t , you use A + (B-A)*hermitelerp(t).

BTW amateurs can easily get into overusing this. It's not necessarily "better" than a lerp. Yes it is C1 (velocity continuous) if you require that the end points are zero velocity. It makes the curve slow at the ends and fast in the middle. Sometimes that's bad. I've seen people just tossing hermitelerps everywhere because it's "smoother" and of course that's retarded.

Of course very similar is

coslerp(t) = 0.5 - 0.5 * cos(t * pi)
which is C-inf if the end points are non-moving. Another problem with these of course is if you use them repeatedly to chase a target, you get really stair-steppy motion, you stop, speed up, slow to a stop, speed up, slow to a stop, it's dumb.
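In code (a trivial sketch; as noted above, not automatically better than a plain lerp):

```cpp
#include <cmath>

inline float hermitelerp(float t) { return (3.f - 2.f*t)*t*t; }
inline float coslerp(float t)     { return 0.5f - 0.5f*cosf(t*3.14159265f); }

// drive an ordinary lerp between A and B through a shaping curve :
inline float lerpshaped(float A, float B, float t)
{
    return A + (B - A)*hermitelerp(t);
}
```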

Anyway, we all know that, but if we think a bit more about our lerpers, it's obvious that there's this whole class of functions that remap the unit interval :

f : [0,1] -> [0,1]
f(0) = 0
f(1) = 1
A true remapper is also always in bounds and monotonically increasing in [0,1] :
0 <= f(t) <= 1 , for t in [0,1]
(d/dt) f(t) >= 0 , for t in [0,1]
though these last two strict requirements can often be relaxed depending on the application.

Of course the identity f(t) = t is your basic option.

I think of these as "response shapers". There are lots of times when you are really doing some kind of arbitrary response curve and you don't realize it. For example, say you want to show the user's health through color. You might do something like :

color = red + (green - red) * (health / health_max)
But really what you should have written was :
t = health / health_max;
color = red + (green - red) * response(t);
where response() is a unit interval reshaper. Of course the identity is one option, but it's not necessarily the best one.

It's particularly obvious that a response() curve belongs anywhere that you change units, because the meaning of values in one set of units is not necessarily linear in the other units. I don't really like the notion of sticking response() curves arbitrarily all over to fudge things within the same units - as long as you're in one set of units you should be able to do math to tell you what the curves should be like, but when you change units all bets are off.

One example is the one we used already - any time you convert a variable into a graphical indicator, such as color, brightness, the size of an explosion, etc. - those could easily get a response() curve. The other classic example is converting a mouse or gamepad stick to physical units. Instead of

torque = k * (stick deflection)
it should be
torque = k * response( stick deflection )

Now of course if your unit conversion is meters -> feet that's not what I mean by a unit change, and if it's something where you have a natural equation to convert units, that doesn't call for a response curve, but pretty much any time you change units using some non-physical unitful constant multiplier, it's suspicious. What makes you think those units are on the same scale that you can just use a linear conversion? You should assume you need a response curve, then you just might use the identity response.

Let's look at a few remappers. You can build a bag of tricks and get an intuition for these things and then start tossing them in the right places and understand how they affect control response.

First of all obviously you need to first map your values into 0->1 using something like :

float fmakelerper(float val,float lo,float hi) { return fclampunit( (val - lo) / (hi - lo) ); };
I wind up writing a lot of
flerp(response( fmakelerper(val,lo,hi) ), lo, hi)
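For reference, self-contained versions of those helpers (fclampunit and flerp are cblib functions; these bodies are my guesses at their exact behavior, from the names):

```cpp
// clamp to the unit interval :
inline float fclampunit(float x) { return x < 0.f ? 0.f : (x > 1.f ? 1.f : x); }

// plain lerp, parameter first to match the usage above :
inline float flerp(float t, float lo, float hi) { return lo + t*(hi - lo); }

// map [lo,hi] -> [0,1], clamped :
inline float fmakelerper(float val, float lo, float hi)
{
    return fclampunit( (val - lo)/(hi - lo) );
}
```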

Now, given a few basic remappers, there are various transforms you can do which create new valid remappers.

Any time you have a remapper(t), you also have

remap2(t) = 1 - remap1( 1-t )
which is the reflection in X and Y (or the 180 rotation if you prefer). So for example if you have a curve that's always above the straight line, you can change that to a curve that's always below using 1-curve(1-t). And of course if you have any two remappers, then any lerp between remappers is also a remapper :
remap3(t,k) = k * remap1(t) + (1-k) * remap2(t)
This is a way you can expose shape control to the user or artist. Lerping a shaped curve with the identity curve is a good way to tweak the nonlinearity. Finally you can also remap a remap :
remap3(t) = remap1( remap2(t) )
Or any combination of these ways to make a new one.

The first ones that are obvious are the polynomials. Any (t ^ n) works, but we usually work with

square(t) = t*t
cube(t) = t*t*t
Intuition : these push the curve downward. That makes the response stay low longer and then shoot up at the end. This is useful for giving a linear ramp more "impact" and is used a lot in graphics. Linear ramps look boring, if you want a light to turn on over time t, something like square(t) is better. Obviously cube is the same thing but more extreme. These are also useful for providing fine control in UI elements like gamepad sticks or microscope zooms or camera focus; it makes small deflections even smaller, so that you can do very fine work, but it still lets you ramp up to a full speed change when you crank it.
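E.g. for the stick case (a sketch; cube is just my pick of curve here, and "k" is whatever unitful scale your sim wants):

```cpp
inline float cube(float t) { return t*t*t; }

// stick deflection in [0,1] -> torque ; cube keeps small deflections very
// small for fine control, but still reaches full output at the edge
inline float stickToTorque(float deflection, float k)
{
    return k * cube(deflection);
}
```

At 20% deflection this gives less than 1% of full torque, versus 20% for the linear version, which is the "fine control" effect described above.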

Of course you can do powers of polynomials < 1 too , the basic gamma correction approximation is sqrt :

sqrt(t) = t^.5
intuitively this makes low values jump up fast, then it tails off slower. This is roughly how you convert light linear pixels to gamma corrected pixels. Of course there are also quadratic approximations of sqrt
approxsqrt(t) = t * (27 - 13*t)/14
which I optimized over [0,1]. That's slightly better than (2*t - t^2) but of course that super simple version is okay too. Note that the super simple version there is just the flip & inverse of square(t).
1 - square(1-t) = (2*t - t^2)
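To put rough numbers on that (my own sampling check, not a proof of anything):

```cpp
#include <cmath>

inline float approxsqrt(float t)  { return t*(27.f - 13.f*t)/14.f; }
inline float simplesqrt(float t)  { return 2.f*t - t*t; } // = 1 - square(1-t)

// sampled max error vs the real sqrt over [0,1] :
inline float maxerr(float (*f)(float))
{
    float worst = 0.f;
    for (int i = 0; i <= 1000; i++)
    {
        float t = i/1000.f;
        float e = fabsf(f(t) - sqrtf(t));
        if (e > worst) worst = e;
    }
    return worst;
}
```

Both hit the endpoints exactly; the worst error of either is near t=0 where sqrt's slope blows up, so neither quadratic can be a great fit there.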

Then of course there are various kinds of parameterized warpers that you can expose to prefs to tweak response curves. One is the exponential :

explerp(t,k) = (e^(k*t) - 1)/(e^k - 1)

k > 0
for k very small this becomes a straight line (the identity)
as k gets larger the curve is pushed into the corner in the bottom right
for k very large it's a step function at t=1
and of course 1-exp(1-t) is useful too. (that's the same as using k < 0)
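As code (note the k -> 0 limit is 0/0, so real code should special-case tiny k to the identity):

```cpp
#include <cmath>

// k != 0 ; small k ~ identity, large k pushes the curve into the
// bottom-right corner, huge k approaches a step at t=1
inline float explerp(float t, float k)
{
    return (expf(k*t) - 1.f)/(expf(k) - 1.f);
}
```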

Then there's the quadratic spline remappers.

quad1(t,p) = t*t + 2*p*t*(1-t)

p in [0,0.5] is a true unit remapper
Of course this is just lerp of the identity and the quadratic remappers. More generally :
inline float fquadraticsplinelerper(const float f, const float ptX, const float ptY )
{
	ASSERT( fisinrange(ptX,0.f,1.f) );
	ASSERT( fisinrange(ptY,0.f,1.f) );
	ASSERT( !fequal(ptX,0.5f) );	// singularity here
	float bx = ptX;
	float a = (1.f - 2.f*bx);
	float A = bx*bx + a*f;
	float t = (sqrtf(A) - bx)/a;
	float y = t*t + ptY*2.f*t*(1.f-t);
	return y;
}
will make a curve that goes through (ptX,ptY)

BTW I've only talked about unit remapping, but half-line remapping is very useful too and is a whole other topic. Two of the most useful are tanh(t) , which takes [0,inf] -> [0,1] smoothly, and log(t) which is [0,inf] -> [0,inf] but is very useful for putting things in linear importance scale when the relative magnitudes are what determine importance. For example both sound and image deltas have more linear information content in log scale.

Now we can get to what reminded me of this. Won sent me a link to Herf on Stopping . Mmmm, okay. He comes up with some funny curve. But if you look at the curve he wants, it looks just like a "smooth step" but all squished over to the left. Well, guess what, squishing to the left is exactly the kind of thing that remappers are good at. If we have some plot of a function F(x) in [0,1] and we want to make a new function that has the same endpoints but is squished to the left or the right or up and down - that's exactly what a remapper is. If you imagine having a picture of a checker board pattern, when you run one of these unit warpers, you're stretching some squares and squeezing others.

So, to make the kind of shape Herf wants, we can just use :

stopping(t) = hermitelerp( 1 - cube(1-t) )
and that looks pretty darn good. If you want more control you can use :
stopping(t) = hermitelerp( 1 - explerp(t,k) )

where k = -4 looks good to me

Just for more intuition building I'll show some other ways to get this kind of curve :

squareinv(t) = 1 - (1-t)^2

stopping(t) = hermitelerp( hermitelerp( squareinv(t) ) )
Each time you pass through hermitelerp it makes the ends slower and the mid faster, so doing it twice exaggerates that. squareinv skews the shape over to the left. So we make a shape with really flat ends and a steep middle, then we skew the whole thing to the left and that's the kind of stopping curve we want.
stopping(t) = squareinv( hermitelerp( squareinv(t) ) )
Similar, but the squareinv on the outside takes the shape and skews it upwards. Of course the squares could be cubes to be more extreme, or the squares could be lerped with identities to make them less extreme.
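The first stopping version as code (building on the earlier hermitelerp and cube, repeated here so it stands alone):

```cpp
#include <cmath>

inline float hermitelerp(float t) { return (3.f - 2.f*t)*t*t; }
inline float cube(float t)        { return t*t*t; }

// smooth step squished to the left : fast start, long slow tail-off
inline float stopping(float t) { return hermitelerp(1.f - cube(1.f - t)); }
```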

BTW it's a pet peeve of mine when people use physics terms to try to justify something totally hacky, especially when they get it wrong. There's absolutely nothing wrong with being unphysical and just saying "this function looks good". Herf says his "stopping" is based on physical friction; in fact he is not modeling "friction", he's modeling linear drag. Friction is not velocity dependent, it's a constant force (proportional to the normal force, which is proportional to mass in the normal case). Linear drag is a deceleration proportional to your velocity which produces the exponential slowing. Linear drag is the easiest diffeq anyone can solve : a = -k v.

While I'm ranting let me also note that linear drag is not what you usually see in nature. Drag in the air is mostly quadratic drag. That is a = - k v^2. You do see linear drag in the water at low speeds when you have laminar flow. BTW this is why the shape of boat hulls is really crucial - it's not just like a 10% change, it's the difference between laminar and turbulent flow, which changes your drag from linear to quadratic, which is like 1000X difference for fast boats. (it's actually way more complicated than this but that's the rough idea).


03-17-08 - 2

I've always known that implicit integration is like "the stable way" , but I guess I don't really grok *why* it's super stable. I guess if you look at some graphs of the potential wells of spring forces you can kind of get why it works for those, with a regular Euler integrator you are drawing these tangents that are very steep and you easily overshoot, whereas if you step ahead and take the tangent of the end-point it's milder. But that feels awfully hand wavey.

Anyway, I reminded myself that even the implicit Euler is really very rough and not what you want, because it is just evaluating the function at one point along the time step.

You want to solve (d/dt) y(t) = F(y,t) . When you discretize to steps of " h ", ideally you'd get the average F over the [ t , t+h ] interval. But if you could analytically get the average over the interval you could just analytically solve the whole thing and you wouldn't have to numerically integrate at all. Which of course I guess should always be the first step that we sometimes forget - see if you can just solve it and fuck this numerical shit.

Euler : (bad)

( y(t+h) - y(t) ) / h = F( y(t), t )

Implicit Euler : (stable, but still bad)

( y(t+h) - y(t) ) / h = F( y(t+h), t+h )

Midpoint : (pretty accurate)

( y(t+h) - y(t) ) / h = F( y(t) + (h/2)*F(y(t),t) , t+h/2 )

Implicit Midpoint : (accurate & stable)

( y(t+h) - y(t) ) / h = F( (y(t) + y(t+h))/2 , t+h/2 )

In particular, implicit midpoint will actually keep you on the correct path in phase space {p,q} when you're solving a Hamiltonian system. There will be error, but it will be parametric error in following the path, not error that makes you jump to other paths.

Of course for implicit methods you have to do algebra to solve for y(t+h) in terms of y(t) which is not always possible.
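As a toy illustration of both the stability and that algebra step, take the stiff test equation y' = -k*y (my own sketch, not from testdd). The explicit step multiplies by (1 - h*k) and blows up when h*k > 2; for the implicit step you solve y(t+h) = y(t) + h*(-k*y(t+h)) for y(t+h) by hand and it's stable for any step size:

```python
# Explicit vs. implicit Euler on y' = -k*y (exact solution decays to 0).

def explicit_euler(y, h, k, steps):
    for _ in range(steps):
        y = y + h * (-k * y)          # y(t+h) = y(t) + h*F(y(t),t)
    return y

def implicit_euler(y, h, k, steps):
    # F(y(t+h)) = -k*y(t+h), solved algebraically: y(t+h) = y(t)/(1 + h*k)
    for _ in range(steps):
        y = y / (1.0 + h * k)
    return y

k, h = 100.0, 0.1                     # h*k = 10, way past explicit stability
print(abs(explicit_euler(1.0, h, k, 50)))   # explodes
print(abs(implicit_euler(1.0, h, k, 50)))   # decays toward 0
```

Note the implicit step decays no matter how big h*k gets, which is exactly the over-damped behavior described below.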

I guess this is all duh duh review. Also more generally Runge-Kutta is a much more accurate version of "midpoint" and then you could also do implicit Runge-Kutta as well.

The thing I forget sometimes is that the standard Implicit Euler that we like to use in games is not like the awesome solution. It just happens to be super stable for spring type systems, but it actually acts like an artificial damping of the system. Numerically, in terms of how close it gets to the right answer, it's very bad - in fact it's just as bad as the plain forward Euler (both are only first order accurate).

03-17-08 - 1

Gambolin' Man is a sweet Bay Area hiking blog, except that he put a bajillion pictures on the front page so it takes ten years to load and FireFox retardedly stalls out while it loads.


03-16-08 - 2

Addendum / FYI : there's been a lot of followup to this and I'll post some useable code some day soon, in like a simple .h you can just include or something.

So, Jon asked this question and I realized that I don't actually know of a good answer :

You have some state variable with inertia, we'll call it a position & velocity {P,V}
You want to drive that variable towards some target, which may or may not continue to change {TP,TV}
You want to reach the target in finite time, and if the target is still you should reach it within time K
You want both your position and velocity to be at least C0 (value continuous)
You want a "good path" which is rather hard to define exactly, but overshooting and oscillating is bad, as is unnecessarily fast accelerations; you don't exactly want to minimize the energy required to move because that will give you very loose springy motion.

(side note : this is not actually an "interception" problem, in that the target point is not considered to be actually moving linearly, you need to actually hit the exact end value "TP" and when you hit it you must have the velocity "TV" , you don't get to hit the point TP + t*TV somewhere along its path ; note that if you can solve this problem then you can solve the interception problem by using a moving frame, but the converse is not true).

PD controllers are very commonly used for this in games. They're nice and smooth (they're always C1, and are also C2 if the target moves smoothly), but have a few bad properties. For one thing they never actually reach the target, they just keep getting closer forever. Furthermore, if you damp them enough to prevent overshoot, they converge very slowly. People often use a critically damped controller, but it doesn't really solve these issues (overshooting vs. overdamping), just picks a middle ground where you have a little of each problem.
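For reference, here's the usual games-style critically damped PD step (a sketch with my own names, not anyone's production code; the damping coefficient 2*sqrt(k) is the critically damped choice). You can see the never-quite-arrives behavior: the error only decays exponentially.

```python
import math

def pd_step(p, v, tp, tv, k, dt):
    # Critically damped PD controller driving {p,v} toward {tp,tv}.
    kd = 2.0 * math.sqrt(k)                 # critical damping coefficient
    a = k * (tp - p) + kd * (tv - v)        # spring + damper acceleration
    v += a * dt                             # semi-implicit Euler update
    p += v * dt
    return p, v
```

Stepped repeatedly, p converges toward tp but never exactly reaches it in finite time, which is the complaint above.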

So far as I know there is no standard good solution to this, which is odd because it seems like something we want to do all the time. Does anybody know if there are good solutions to this problem ? It kind of blows my mind that we don't all just have this routine sitting around.

So I made a test app with various more or less hacky solutions : testdd zip and there are algorithm descriptions in the readme

I also just added a new mode in there today which so far as I can tell is the best way to do this. It's "mode 1" the "cubic maxaccel".

It solves a cubic, which is the lowest order polynomial that hits the endpoints (and thus is C1). The cubic is constrained to never accelerate faster than a max acceleration parameter. This removes all the hacky stuff about resetting the time to get to the target when the target moves slightly. I can't find a flaw with this, it just seems to work.
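For concreteness, here's a 1-d sketch of the max-accel constraint check (helper names are mine; the real N-d version is in testdd). Fit a Bezier cubic to the end positions & velocities over duration d; since the acceleration of a cubic is linear in time, you only have to check it at the two endpoints:

```python
def cubic_endpoint_accels(x0, v0, x1, v1, d):
    # Bezier control points fit to end positions & velocities
    b0 = x0
    b1 = x0 + v0 * (d / 3.0)
    b2 = x1 - v1 * (d / 3.0)
    b3 = x1
    a0 = 6.0 * (b0 + b2 - 2.0 * b1) / (d * d)   # acceleration at s = 0
    a1 = 6.0 * (b3 + b1 - 2.0 * b2) / (d * d)   # acceleration at s = 1
    return a0, a1

def duration_ok(x0, v0, x1, v1, d, max_accel):
    # A candidate duration d is feasible if neither endpoint exceeds max accel.
    a0, a1 = cubic_endpoint_accels(x0, v0, x1, v1, d)
    return max(abs(a0), abs(a1)) <= max_accel
```

e.g. a rest-to-rest unit move has endpoint accelerations of +-6/d^2, so shrinking d below sqrt(6/m) violates the constraint.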

Now it might also be interesting to go ahead and use a quartic curve, because then you could also keep the acceleration continuous as well. Dunno if that would actually make better looking curves because the cubic paths look pretty good already.

It's very important IMO for whatever algorithm here to not have a lot of hacks/tweaks/prefs that need tuning, because you want to be able to take your interpolator and toss it anything and have it "just work". For example, from what I can tell a PID controller is just a no-go because tweaking the values is very difficult and data dependent so it's not practical to use as a generic interpolator. What I want is basically one single parameter for the interpolator which controls how fast the value converges to the target.

I would like to see or compare to an AI steering type of interpolator, but I don't really want to write it, so if somebody has code for something like that, please send it to me, ideally as function that plugs right into that testdd/main.cpp

Also the code I've got in there for dealing with finding the root of the quartic polynomial is kind of nasty, so if you want to fix that up go for it. In words : I need to find the lowest non-negative root of a "depressed quartic". Any kind of iterative thing like Newton's is nasty because you have to start with some seed, and there's not a simple way to be sure you picked the right seed to get the lowest positive solution.
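One alternative that dodges the seed problem: Durand-Kerner iteration converges to all four complex roots simultaneously from fixed standard seeds, and then you just filter for the lowest non-negative real one. A rough pure-python sketch of my own (written for a general monic quartic, not specialized to the depressed form):

```python
def quartic_roots(a3, a2, a1, a0, iters=200):
    # Durand-Kerner: simultaneously refine 4 approximations to all roots
    # of the monic quartic x^4 + a3*x^3 + a2*x^2 + a1*x + a0.
    def p(x):
        return (((x + a3) * x + a2) * x + a1) * x + a0
    roots = [(0.4 + 0.9j) ** k for k in range(4)]   # standard distinct seeds
    for _ in range(iters):
        for i in range(4):
            q = 1.0
            for j in range(4):
                if j != i:
                    q *= roots[i] - roots[j]
            roots[i] -= p(roots[i]) / q
    return roots

def lowest_nonneg_real_root(a3, a2, a1, a0, eps=1e-8):
    # Keep roots with negligible imaginary part and non-negative real part.
    reals = [r.real for r in quartic_roots(a3, a2, a1, a0)
             if abs(r.imag) < eps and r.real >= -eps]
    return min(reals) if reals else None
```

Convergence can be slow near double roots, so this is a sketch rather than something battle-tested.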

03-16-08 - 1

My family on my dad's side is partly from "State College" PA. It never occurred to me until just now what a ridiculous name for a town that is. When I was a kid they said yeah we're from State College, and I was like okay, yeah, those are just words that designate a place that don't mean anything, fine, you're from there. But it actually is where the state college is. It's like naming your town "Steel Mill" or "Capital City" or "Shipping Port". Where do you live? Oh I live in "Walmart Distribution" TX.


03-15-08 - 1

I've talked before about how Web 2.0 is fucked, but cooking & food information is probably the best example, so let's look at that.

The information that exists out there currently is basically in three forms :

1. Food blogs. There's a ton of great food blogs, but they are not aggregated/indexed, so it transforms useful reference material into useless feature/diary materials.

2. Recipe collection sites like cooks.com ; These are ridiculously overloaded with just god-awful recipes. The star rating systems they use are worthless because they're dominated by morons who have zero concept of cooking. This is a big part of what makes the recipe problem so obvious, is that there's such a massive amount of bad information out there, and also a massive amount of retarded people rating things, so that any non-localized rating system is worthless.

3. Sponsored sites like FoodNetwork or major individual's pages like Joy Of Baking. These are actually the most useful places to get recipes, because the standards are high, they're searchable, and furthermore they give you the author's name which gives you context.

The first most obvious thing that's needed is a global aggregator that breaks the data atoms from blogs into a single searchable database. As usual stupid Google is not a great way to search for recipes, because pagerank is not at all what you want for sorting results, and you get tons of spurious pointers to restaurant menus and junk like that.

More importantly though you need author context. To some extent, all information in the modern era is only useful with author context. There's too much information out there for you to trust the quality of it without knowing where it came from. Now if you go to FoodNetwork or something and see actual names and know that a recipe from "Alton Brown" is probably good, that's fine, but that is only possible with a limited network, for the whole web there will be too many people and they may have only a few recipes so the information is too sparse for you to remember them all.

Obviously what you need is some kind of Collaborative Filtering, but really that's not ideal, what you want is a Network of Trust. The big difference is I get manual control over my links and I also get to see where they're coming from. So if I want I can just say thumbs up/ thumbs down on various recipes and get CF recommendations, but I can also manually say "I trust Alton Brown" and automatically get recommendations from him and also everything that he trusts. Seeing WHY you got something recommended is also very useful ; some recipe that looks nasty is recommended to you, but on the side it shows a network connection from Alton Brown and you can conclude that in fact this bizarre looking recipe is worth trying.

The reason recipes are such a strong example of how Web 2.0 is fucked is that there's tons of good information out there that you should be able to find, and currently you just really can't.

The whole thing that's great about Web 2.0 is that there are actual communities with actual people, and you can get to know them, and when you read bits of content you can see who it came from, and that little "who" piece of information is incredibly valuable. The problem is it's balkanized and they aren't working with collaborative filters. When they do have collaborative filters, they just run an impersonal CF algorithm which throws away the valuable revelation of the "who" to the user which is the value of the community.


03-14-08 - 4

I've gotta stop drinking so much. I really don't like it. Well I like it, but I don't like being hung over at all. The problem is I just cannot handle social interaction without booze, so for me meeting people = boozing. I mean, have you ever hung out with people without alcohol !? it's unbearable, you're totally aware of everything, everyone is so dorky, then you get bored, it just doesn't work. I've known some really cool coders that I really liked, but I could never really be friends with them because they didn't drink, and me trying to socialize without boozing = kill self.

03-14-08 - 3

Walking around today I noticed a hispanic guy going through recycling bins taking out bottles. That's not unusual at all here, quite a few people seem to be scraping by in the city on the refuse of the rich. He was wearing an iPod as he did it. Something is seriously wrong with the universe when the guy who's scavenging in people's trash for a living is listening to an iPod.

03-14-08 - 2

Whether you can talk to someone really has nothing to do with what you have in common. Dan used to try to set me up on "blind dates" with guys because I'm such a fucking loser I need help meeting guys. She would say "oh, you would like this guy, he rides bikes". Oh really? So our conversation will be like :

"hey, uh, I hear you like bikes"
"me too"

Sweet conversation, glad we did that.

03-14-08 - 1

Today I was watching a Rick Roll (I love getting Rick Rolled, I'm so addicted to that song, some times I just Rick Roll myself), and I realized that I dance a hell of a lot like Rick Astley. That's mostly not a good thing. In fact the only time it is a good thing is when I'm trying to do a Rick Astley impersonation.


03-13-08 - 3

Yikes. 45 year old short bald guy just came out of Slutty Girl's apartment. So gross. He had a spring in his step. WTF is she thinking. That's the first time I've seen anyone outside of the 20-30 hipster demographic.

03-13-08 - 2

The iPod Shuffle is such a shitty product. For one thing, the physical design of the buttons is retarded. It's this cute looking circle, which I guess is intended to mimic the look of the normal ipod wheel thing, but it's not a wheel it's just 4 buttons. The consequence of the circle is that it's rotationally symmetric so you can't tell what you're touching and it makes it really hard to hit buttons by touch. A correct design would be to just have four buttons with distinct shapes, like the Playstation control buttons.

Furthermore, the software/firmware is shit. You should be able to set up playlists that choose a subset of the songs, and then select between the playlists. This could easily be done with no additional buttons, just a shift-click or something like that. Also the sequential play mode is a pain because you have to just skip through a ton of songs, but if there was a "skip album" shift-click it would be pretty useable.

03-13-08 - 1

The best condom review site is "RipNRoll" because they have the detailed dimensions of every condom. There's actually a lot of size variation among normal sizes, and it's not printed on the boxes, so without teh interweb you don't know what you're getting. For example, most of the much-celebrated Japanese condoms are slightly smaller than normal. Some others are slightly bigger than normal, and none of it is on the packaging.


03-12-08 - 4

How to take the arms off your Aeron :

No chair should have arms, they're horrible for you. Fortunately it's pretty easy to take them off the Aeron. First remove the back of the chair. There are 4 large hex bolts that hold the back on to the two posts coming up from the seat. Those come off easily. Now you can see the inside of the sliding arm assembly. Each arm is held on by one bolt. These are star-head ("torx") bolts which nobody has the tool for. Fortunately if you jam a hex key in there really hard and press firmly while you turn you can get these off. The arm tightening wheels will spin so you have to hold them in place to unscrew these bolts. The sliding arm assembly is sprung and the tightening wheels have ball bearings, so you want to be careful as you get close to taking these bolts out - try to hold it all together as you pull the bolts so it doesn't all go flying. The arms should fall right off now and you can put the back back on.

LOL on the negative side I keep reaching for the arm to brace myself when I sit down and almost just wiped out when I reached and grabbed air.

Also a small bonus, the really ugly Aeron looks much better without arms.

03-12-08 - 3

My résumé really doesn't look that hot. If you just saw it and didn't know me, you would never hire me. I'm pretty much relying on my rep in the industry, which is still pretty decent, but I think I'm fucked for jobs that aren't somehow connected to games. For one thing I have the big gap for the last 3 years, but even before that it looks like I moved around a lot. I really didn't move around that much, there are things like Eclipse->Wild Tangent which is the same company, and then there were part time and summer jobs when I was a kid, and then I was at Oddworld a long time, until they closed, but if you look at the résumé there are all these 3-6 month stints and it looks like I don't stick it out when times are tough. The only place I really bailed on was Surreal because it was a total disaster.

Also as usual I've timed things to be at exactly the wrong phase of the economy. Yay me.

Also I'm not sure who to use as a reference. Past employers are meh. I guess I'll just use past coworkers and general game industry people.

03-12-08 - 2

My left shoulder is pretty much painless & has full mobility, but there's a weird loud pop that it makes in some movements. I've been reading up on symptoms and I think I may have a torn labrum (probably a SLAP), which would fit with the mode of injury (falling on outstretched arm). I've learned that arms are fucking lame and if you're falling, don't try to stop your fall by putting out your hand, just take the impact with your face, you'll take less damage. Of course the treatment for a SLAP is probably nothing.

I've been making myself go out to cafes a bit to read & write just so I'm not sitting alone in my house all the time. I fucking hate it. In general I hate all these "activities" that people do which are basically just leaving home to sit around or stand around somewhere else. If I'm gonna be doing nothing, I'd rather be at home where I can get naked and browse the internet and watch TV and dance around and cook. If I'm gonna have all the trouble and pain of leaving home, I wanna fucking do something. Let's build a fort from cardboard boxes, or paint graffiti, or practice cheerleader routines, or get naked and have a feather tickling fight, or fucking something, anything.

My computer has started doing some of those weird awful things that inevitably happens to Windows if you use it a lot. Like sometimes I'll hit standby and the machine will just seem to lock up for 5 minutes and then eventually do it. WTF Windows.

03-12-08 - 1

Projects I'd like to work on :

ImDoub. See notes here. One of the complicating things is that you really need the ML (maximum likelihood) double, not the Ln-norm lowest error double.

Network of Trust. See many previous writings here. Basically this is just collaborative filtering, but where I'm in control of my own graph, more like my "friends network" on facebook or something. This is only really awesome if it's global, which is connected to :

Unified blog / recommender accounts. The balkanization of Web 2.0 is severely reducing how useful it could be. All the separate accounts and friend connection networks means that your return on effort is linear, instead of exponential like it should be. (an open friend system is exponential because making a new connection linearly increases your rate of new connections). This is mainly a political problem, but if you built a really good framework for it, maybe you could convince more of the little forums to drink the koolaid. Obviously the sites are opposed to this because their proprietary user-created data is their treasure.

My text -> blogger app so I can write my blog like this but have it update to blogger or whatever.


03-11-08 - 7

Netflix Prize notes , Part 5 : References

Links :

Torch3 The Dream Comes True
TinySVM Support Vector Machines
The SMILES System
The Robotics Institute
The OC1 decision tree software system
TANAGRA - A free DATA MINING software for teaching and research
tan classifier - Google Search
SVM-Light Support Vector Machine
SVM Page of Keerthi's Group at NUS, Singapore
Support Vector Machines Platt
Support vector machine - Wikipedia, the free encyclopedia
Supervised Learning and Data Mining (Tom Dietterich)
Slope One Predictors for Online Rating-Based Collaborative Filtering
Singular value decomposition - Wikipedia, the free encyclopedia
Singhi papers
Rapid - I - Home
Radford Neal's Home Page
Publications, Roman Lutz
Problems of the multivariate statistical analysis
Principal components analysis - Wikipedia, the free encyclopedia
Predictive complexity
Performance Prediction Challenge Results
Penn Data Mining Group Publications
PDP++ Home Page
PCP - Pattern Classification Program
PB NNets
Nonlinear Estimation
Neural Networks & Connectionist Systems
Neural Network FAQ, part 5 of 7 Free Software
Neural Network FAQ, part 2 of 7 Learning
Netflix Update Try This at Home
Netflix Challenge
Moonflare Code Neural Network
MLIA Technical Papers
Michael Pazzani Publications on Machine Learning, Personalization, KDD and Artificial Intelligence
Machine Learning and Applied Statistics - Home
LSI - Latent Semantic Indexing Web Site
Linear discriminant analysis - Wikipedia, the free encyclopedia
LIBSVM -- A Library for Support Vector Machines
Learning with Kernels
LAPACK++ Linear Algebra Package in C++
Kernel Machines
Keerthi npa
Joone - Java Object Oriented Neural Engine
Introduction to Radial Basis Function Networks
Independent component analysis - Wikipedia, the free encyclopedia
Huan Liu
Geoffrey E. Hinton's Publications in reverse chronological order
Fast Artificial Neural Network Library
FANN Creation-Execution - Fast Artificial Neural Network Library (FANN)
ensembles decision trees - Google Search
Decision Trees
Decision tree - Wikipedia, the free encyclopedia
DBLP Gavin C. Cawley
Data Mining (36-350) Lecture Notes, Weeks 4--7
CS540 Fall 2005 Lectures
Collaborative Filtering Resources
Collaborative Filtering Research Papers
CiteULike bpacker's library
Chris Burges's Publications
Ben Marlin - Research
Balaji Krishnapuram
Atkeson Local Learning
Amazon.com The Elements of Statistical Learning Books T. Hastie,R. Tibshirani,J. H. Friedman
Aldebaro's publications
Netflix Prize Home

03-11-08 - 6

Netflix Prize notes , Part 4 : other stuff @@ ... coming soon ...

user bias from the value they were shown Filling blanks Dimensional reduction, k-Means ; SVD and NNet reduction. Similars that don't necessarily have great overlap. Hidden germinators.

03-11-08 - 5

Netflix Prize notes , Part 3 : Local weighting of predictors

So, last time we built some CF predictors and talked a bit about combining them, so let's go into more detail on that. This is going to be even more rough because it's the stuff I was still actively working on when I stopped.

First, how you would make an optimal global weighting of predictors. This is easy because our metric is an L2 error, so we can just use an LSQR fit. For each query in the test set, we run the CF predictors. We put all the CF predictors, [I] through [VII], into a row vector. We also tack on the movie average and the user average and a constant 1.0. We take all these rows from all the test queries and tack them together to form a big matrix A and solve to minimize |Ax - b|^2 , where b are the actual ratings. In reality we want a damped solver as mentioned previously to reduce overtraining. We also want to do this with the N-chunk cross training previously mentioned. The solution "x" gives us the coefficients for each of the predictors in our row. Since this is a totally linear model, if we train N times we can just blend those together to make a single average set of coefficients. This is the optimal set of coefficients for training our predictors. Note that this is also a really good way to see what predictors are working well - they will have coefficients that are large, the ones that are worthless will have coefficients near zero.
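A minimal sketch of that damped fit with no library solver (names and the damping parameter `lam` are mine): minimize |Ax - b|^2 + lam*|x|^2 by solving the normal equations (A^T A + lam*I) x = A^T b.

```python
def solve(M, y):
    # Gaussian elimination with partial pivoting on the n x n system M x = y.
    n = len(M)
    A = [row[:] + [y[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = A[r][n] - sum(A[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / A[r][r]
    return x

def damped_lsqr(rows, b, lam=0.1):
    # rows: one list per query (predictor outputs + movie avg, user avg, 1.0)
    # b: the actual ratings; returns the blend coefficients.
    n = len(rows[0])
    AtA = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    Atb = [sum(r[i] * bi for r, bi in zip(rows, b)) for i in range(n)]
    return solve(AtA, Atb)
```

The damping shrinks coefficients toward zero, which is the anti-overtraining knob mentioned above.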

But it's immediately obvious that a global weighting can be beat. Different queries have different characteristics, and will have different optimal weightings. For example, you might do one query and find that there are similar movies that are very very similar, but there are no really good similar users for that query. In that case you will want to not use the N-user prediction at all, and just use the N-movie prediction. Local weighting will let us select for the predictor that is more suited to the query, but of course you don't just want to select you want to do some linear combo.

Now, this is very similar to combining "experts" and there's lots of theory on that. It does not however fit the normal experts model, because we aren't getting feedback as we go along, and we don't really have time continuity of the queries, so we can't track how well experts are performing and adjust their weight using those schemes.

One thing we can do is estimate the error of the various predictors. Think of each predictor as a predictor for data compression. Rather than just predict a value, you need to predict a probability spectrum. Let's make Gaussian probabilities, so we need to predict a center and a width. The center is just the prediction value we already did in Part 2. We need a width which minimizes the entropy. eg. if we're more confident we can have a smaller width, if we're very unsure we must predict a large width. This is the same as estimating an MSE for each predictor.

Once we have an MSE estimate for each predictor, we can combine them in various ways which I've written about in the past here. For example, we could weight each predictor by

e^(- Beta * MSE)

for some constant Beta to be optimized, or by

(1 / MSE)

and of course normalize by dividing through by the sum of the weights.
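Both weighting schemes in a few lines (a sketch; `beta` is the constant to be optimized):

```python
import math

def weights_exp(mses, beta):
    # Weight each predictor by e^(-beta * MSE), normalized to sum to 1.
    w = [math.exp(-beta * m) for m in mses]
    s = sum(w)
    return [wi / s for wi in w]

def weights_inv(mses):
    # Weight each predictor by 1/MSE, normalized to sum to 1.
    w = [1.0 / m for m in mses]
    s = sum(w)
    return [wi / s for wi in w]

def combine(preds, w):
    # Final prediction is the weighted blend of the individual predictors.
    return sum(p * wi for p, wi in zip(preds, w))
```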

How do we get an error estimate for each predictor? We train a learner to output MSE given some good conditioning variables. The most obvious thing is the "distance" to the similar movies & users that we compute to find our similar neighbors. Small distance should be a good indicator of confidence. Another pretty obvious one is the sdev of ratings in the L[] matrix along the 0 row and 0 column (after adjusting using slope1). That is, if you look at the "N movie slope 1" it's just a weighted average of an array. Instead I can take the weighted sdev of that same array and it tells me how much variation is in it. I don't just use that sdev as my estimate of the MSE of the predictor, I use it as a training variable. So I gather 4 or so interesting variables like this which are good indicators, and now I have a 4 float -> 1 float supervised learner to train. In my research I didn't work out exactly what the best learner is here; you could certainly use a Neural Net or anything like that. I just used a linear fit.

BTW I should note that any linear fit can easily be made polynomial by adding cross terms. That is, say I have a 4 float input set to learn on. If I want to try quadratic functions, I just add (4*5)/2 = 10 more terms for all the products of pairs you can make from the inputs. The actual learner is still linear, but you find quadratic functions in the inputs. Of course you can do other functions of the inputs besides polynomials, but you can't learn the parameters of those functions.
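The cross-term trick, sketched: for n inputs this appends the n*(n+1)/2 pairwise products (squares included), so 10 extras for 4 inputs, and the same linear learner then fits quadratic functions of the original inputs.

```python
def add_quadratic_terms(x):
    # Extend the input vector with all products x[i]*x[j] for i <= j.
    out = list(x)
    for i in range(len(x)):
        for j in range(i, len(x)):
            out.append(x[i] * x[j])
    return out
```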

So, to summarize : take measurements of the neighborhood that reflect confidence in the different predictors, feed them to a trained learner that will guess the MSE of a predictor (one learner for each predictor), use the MSE to weight the various predictions.

Aside : when combining all these predictors, you do not really want to get as many good predictors as you can to combine, because they will be very similar and not complementary. What you really want are predictors that offset each other well, or predictors that are very good for certain types of neighborhood which can be reliably identified. This is something I learned from the Volf switching compression work - if you just try to weight together the best compressors in the world, it doesn't help because they're all PPM variants and they don't complement. Instead if you weight in some really shitty compressors that are very different, like LZ77 or LZP or Grammars, you get a big benefit because they complement the main PPM coder well.

Aside #2 : at this point I'll mention there are some weird things with pretending we're predicting a Gaussian, because the values are only discrete 1-5. Instead of just averaging together the centers of the preds, we could treat each one actually as a probability spectrum, and add the spectrums. This is messy, I'm not going to get into it here.

Now let's look at a totally different way to do the same thing. Again we're trying to find local weights for the different predictors. This time we're going to think of it as a local regression problem. Instead of finding global coefficients with an lsqr to weight our predictors, we want to find a bunch of regions, and find coefficients in each region.

There are a few ways to do this kind of local regression. The first is to use a classification learner. First let's simplify and just worry about the "N movie slope-1" and "N user slope-1" predictors, since they are the most valuable and it gives us just two categories, let's call these "M" and "U" here. We look at all our training samples and see if the actual value is closer to the M or the U value. We label each value with an M or a U depending on which is closer. Now our goal is to train a learner which can look at the local neighborhood and predict a category, either M or U.

To train this learner we want to use the same kind of things as we used to estimate the local MSE - stuff like the distance to the similar user and the similar movie, the sdevs, etc. Also note that this learner must be nonlinear - if it's just a linear learner like a simple NNet or LSQR then we may as well just do a global linear fit of coefficients. We have a bunch of conditioning values for the learner, these are the axes of some high dimensional space. In this space are scattered the training samples labelled with M's and U's. We want to find the pockets where there are lots of M's or lots of U's and put down splitting curves. The ideal solution for this is an SVM (Support Vector Machine) with a radial basis kernel. SVM's are pretty ugly to train, so before you try to train one you need to try to get all your variables nice, get rid of any that are redundant, remove bias & scale. There are also ugly things about this, one is that to use an RBF machine you need to define a distance in this space, but your values are not in the same units and it's not obvious how to combine them.

I'm not gonna really talk about SVM's, you can follow the links in Part 5 if you like, but I will wave my hands for a second. To do category prediction the simplest thing you can do is find a single plane that puts one category on the front and one category on the back. That's the basic linear categorizer, which can be found with neural nets among other ways. The basic trick of SVM's is to note that the plane test only relies on a dot product, and mathematically any time you have a dot product you can replace it with any other valid "kernel". This is the "kernel trick" and it's very common and useful. So for example a Gaussian RBF is a kernel, so it can be used in the exact same kind of "plane" fit; instead of the dot product you use e^(-k*(a-b)^2)
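The substitution itself is one line; here's a tiny sketch of the RBF kernel standing in where the dot product used to be:

```python
import math

def dot(a, b):
    # The plane test in the linear categorizer only ever uses this.
    return sum(x * y for x, y in zip(a, b))

def rbf_kernel(a, b, k):
    # The kernel trick swaps dot(a, b) for e^(-k * |a - b|^2).
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-k * d2)
```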

If you can make the SVM that makes guesses about where the good "M" and "U" regions are, you still want to actually weight each one rather than selecting. There are a few good ways to do this. One is the SVM can tell you the distance from a query point to the nearest threshold surface. You can then turn this distance into weights some way or other. Another way is instead of training one SVM, you make a bunch of random subsets of your training data and train N SVM's. Now when you query a value you query all N of them, and your weights are the # that voted M and the # that voted U. This method of making a big randomized ensemble and using it to vote is very powerful. It's also great in practice because SVM's are evil on huge data sets, so doing N separate trains on chunks of (1/N) samples is much better.
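The ensemble vote, sketched generically (the categorizers here are stand-in functions from a query point to 'M' or 'U'; in the real thing each would be an SVM trained on a random subset):

```python
def vote_weights(classifiers, point):
    # Query every categorizer in the ensemble and turn the vote counts
    # into blend weights instead of a hard selection.
    votes_m = sum(1 for c in classifiers if c(point) == 'M')
    n = len(classifiers)
    return votes_m / n, (n - votes_m) / n   # (weight for M, weight for U)
```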

Now the final way I'll talk about is using a Decision Tree to select areas and do local regression. Now, with the whole M/U labeling thing we could've totally used a decision tree there as well, so you could apply the same ideas. All of these techniques are just things in your bag of tricks that you can use anywhere appropriate.

A Decision Tree is basically just a binary tree on various attributes of the neighborhood. If we again think of this high dimensional space where the axes are the useful properties about our neighborhood, the decision tree is just a BSP tree in that space. We want to build a BSP tree that takes us down to leaves where the neighborhoods within a leaf have similar attributes. Trying to do this greedy top-down does not work very well because you have to search tons of directions in high-D space and it's hard to find axes that provide good separation. Instead what we're going to do is just randomly build a deep tree and then prune leaves.

To make our DT, we just start splitting. To make each split, we choose a direction in parameter space at random. We then find the centroid of the values in that direction and put a plane there, then go to each child and repeat. I made 8-deep trees (256 leaves). Now we want to collapse leaves that aren't too useful. The reason we need to do this is we are worried about overtraining. We want our tree as small as possible.

What we do is within each leaf, we do an LSQR to linear fit the predictors and find the best coefficients in that leaf. We store the error for these. Then for each node that's just above the leaves, we look at the error if we pruned it - we put the values together and do an LSQR on the union and measure the error there. It's very important to account for the parameters of the model when you do this, because the extra parameters of the leaves always let you fit the data better regardless of whether it's a good leaf or not.

C = # of LSQR coefficients to weight predictors
P = # of parameters used in DT plane

Q(leaves) = (sum of errors in L0) / (N0 - C - P) + (sum of errors in L1) / (N1 - C - P)

Q(pruned) = (sum of errors in L0+L1) / (N0 + N1 - C)

L0,L1 = leaf 0 and 1
N0,N1 = # of items in leaf 0 and 1

prune if Q(pruned) < Q(leaves)
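The prune test above, directly as code (per-leaf sum of squared errors divided by the remaining degrees of freedom; prune when the merged fit scores better):

```python
def q_leaves(err0, n0, err1, n1, c, p):
    # Error of keeping both leaves, penalized for the C fit coefficients
    # per leaf plus the P parameters of the splitting plane.
    return err0 / (n0 - c - p) + err1 / (n1 - c - p)

def q_pruned(err01, n0, n1, c):
    # Error of the single merged fit over both leaves' samples.
    return err01 / (n0 + n1 - c)

def should_prune(err0, n0, err1, n1, err01, c, p):
    return q_pruned(err01, n0, n1, c) < q_leaves(err0, n0, err1, n1, c, p)
```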

These trees can kind of suck because they are randomly made, but as usual we can throw a big hammer at them. Just randomly make a ton of them. Then we can test how well they work by trying them on some training data. The ones that suck we can just throw out. The rest we can average.

BTW an alternative to the DT thing is a kind of k-Means. You pick seed samples, map each sample to the closest seed, this defines clusters. Then you do the LSQR on each cluster. Then to query you interpolate between the fits at each seed. There are various ways to interpolate. Some good ways are just to weight each seed using the distance to that seed using either 1/D or e^(-k*D). Again instead of trying hard to find really good seeds you're probably better off just making a big ensemble by randomly picking seeds and then throw out the bad ones.

The final usage looks like this : build a local neighborhood L[] and compute the basic CF predictors as well as the distances and other good parameters to select predictors with. This defines a point in parameter space. Use the point to walk down the DT (or ensemble of DT's) that we built to find a region. That region tells us coefficients to use to weight all our predictors, which gives us our output prediction value.
