11-07-08 - Randoms

I'm really sick and kind of out of it, so this might not be totally coherent, but here goes :

I randomly found this : Nick Chapman's renderer blog ; he's just some dude who wrote a ray tracer and blogged about it; and now apparently it's A commercial ray tracer called Indigo . Anyway the blog is amusing and has lots of links to good papers and such. I've always wanted to write a physically accurate raytracer with all-frequency light propagation. It seems pretty easy and fun.

Drew asked me about random numbers so I was searching around and found Pixelero . Pixelero has lots of rally awesome Flash code. In particular he's got some nice graphs of the distributions you can make by doing some simple transforms on random numbers : here . More on this in a second.

GoodPracticeRNG by D.T. Jones is a good short simple paper on random number generation and talks about the basic issues for uniform random generation.

Drew asked about generating gaussian or other probability distributions of random numbers. I'm mainly interested in approximate distributions that are tweakable and useful for random events in games. We don't really want true Gaussians that have infinite domain, we want something that's kind of like a Gaussian over a finite range, with a hump in the middle at lower at the edges.

There are lots of methods for this in the literature, but it's mainly pretty hard core and exact stuff. I'm interested in clever little hacky tricks. If you want some background in the serious stuff, the most awesome reference is : Non-Uniform Random Variate Generation book by Luc Devoye . this handbook seems to be a short version of some of the same material. The beginning is a good read for anyone. ( new link )

I'll briefly mention if you want a true Gaussian everyone likes the polar form of the Box-Muller transform. You can look on Wikipedia for more info about this stuff.

Transforms are a handy way to distort probability distributions. Typically you'll be generating a uniform random with a standard random number generator, then you want to transform it to create a nonuniform distribution.

Say you want generate a random variable X with a probability distribution P(X). You want to do a random draw where the chance of each is P(X). To do this you sum the P's to make a cumulative probability function, C(X) = P(X) + C(X - step). (or C(X) = integral P(X)). I assume we're on the interval [0,1] so C(0) = 0 and C(1) = 1.

Now you just generate a random R = rand() in [0,1] , and you find what "bucket" it goes in, that is find X such that C(X - step) < R < C(X).

That search could be slow, but notice that if we could invert C, then it would just be a function call : X = C^-1(R).

Now, some cumulative probability functions are explicitly invertible. Luc Devoye lists a few common useful ones in his book. Others are not invertible directly, but by adding an extra variable or munging them somehow they are invertible, such as the Gaussian with the Box-Muller transform.

Low-order polynomials are invertible, but they're a pain. If your probability distribution is cubic, like a bezier curve, the CDF is quartic, and to invert it you have to solve a quartic equation analytically. That's possible but ugly. For example, a simple hump probability would be P(x) = x*(1-x) (unnormalized). The CDF is 3x^2 - 2x^3 (normalized). You can invert that directly by solving the cubic, but it's ugly.

A handle simple case is to approximate the CDF as piecewise linear. Then you can trivially invert each piecewise linear portion. So you just look at R and select the linear region you're in and then do a simple slope linear invert thing to make X.

Another approach is to just to play directly with C^-1. C^-1 is a kind of unit interval remapper. It has the properties C^-1(0) = 0 and C^-1(1) = 1. (for a symmetric probability distribution it also has other constraints). We should be quite familiar with unit interval remappers and we can just play around with things here. You can tweak directly on C^-1 and see what kind of shape of probability distribution that gives you.

For example, C^(-1)(t) = t , the identity function, is just uniform constant probability. Anywhere that C^-1 is flatter corresponds to higher probability, anywhere it's steep corresponds to low probability. To make something like a Gaussian that has low probability at the edges and high probability in the middle, what you want is a symmetric function that's steep at first, then flattens out, then is steep again. Like a "smooth step" but reflected across the 45 degree axis.

One such function is "sinh". You can look at graphs of sinh to see what I mean. It's not ideal, I'm sure there are better choices to play with, but sinh works. (btw you might think x^3 would work; no it does not, because it becomes completely flat at X = 0 which corresponds to infinite probability).

You can create various shapes with sinh by using different parts of it : curve = sinh( x * K ) / sinh( K ) , where K is a parameter that affects the curve. As K -> 0 it becomes linear, as K gets larger it becomes more severe.

Another option is to use a cubic curve. For a symmetric probability distribution you also have that C^-1(0.5) = 0.5 and you only need to design have the curve, the other half is given by reflection. You thus can design with f(0) = 0, f(0.5) = 0.5, and then set the slope you want at 0 and 0.5 and that gives you a cubic. The slope inversely corresponds to setting the probability P(X) at those points.

The mathematics of combining random numbers is quite interesting. Maybe if I wasn't sick this would be more obvious and less interesting, but right now it seems really cute.

A random unit float is a box filter. That is, if you plot the probability distribution it's a square step on [0,1].

Adding random numbers is equivalent to convolution of the probability distributions.

Add two random unit floats and you get a linear step filter, that's a convolution of a box with a box. If you add another random unit, you convolve with the box again and get a quadratic step filter. Add another random you get a cubic. Note that these are also the polynomial approximations of the Gaussian. I believe I've talked about building up the Gaussian from polynomials like this before. In any case it's in my Filter.h in cblib.

To lerp probability distributions you random select. eg. lerp(P1,P2,t) is if ( rand() < t ) return rand1 else return rand2.

Pixelero showed some interesting stuff with multiplying randoms. Apparently addition is convolution, I'm not sure what multiplication does exactly.

You can also create distributions by doing random selection vs. other randoms, stuff like x = rand1(); if x < rand2() return x; By using functions on the conditional you can create lots of shapes, but I'm not sure how to understand what kind of shapes you get. Luc Devoye's book has got some complex conditional tests like this that generate all kinds of shapes of probability distributions.

Some other good links I found : basic Gaussian rand , stocc source code with lots of random generators , anuran automatic code gen for random generators


Autodidactic Asphyxiation said...

Table based approach for probability distributions:


cbloom said...

Semi-related :



old rants