I'm doing some fun math that I'll post in the next entry, but first some little random shit I've done along the way.
First of all this is just a handy tiny C++ functor dealy for doing numerical integration. I tried to be a little bit careful
about floating point issues (note for example the alternative version of the "t" interpolation) but I'm sure someone with
more skills can fix this. Obviously the accumulation into "sum" is not awesome for floats if you have a severely oscillating
cancelling function (like if you try to integrate a high frequency cosine for example). I suppose the best way would be to
use cascaded accumulators (an accumulator per exponent). Anyhoo here it is :
template < typename functor >
double Integrate( double lo, double hi, int steps, functor f )
{
    double sum = 0.0;
    double last_val = f(lo);
    double step_size = (hi - lo)/steps;
    for(int i=1;i <= steps;i++)
    {
        //double t = lo + i * (hi - lo) / steps;
        double t = ( (steps - i) * lo + i * hi ) / steps;
        double cur_val = f(t);
        // trapezoid :
        double cur_int = (1.0/2.0) * (cur_val + last_val);
        // Simpson ; note the midpoint is *behind* t, in the middle of the current interval :
        //double mid_val = f(t - step_size * 0.5);
        //double cur_int = (1.0/6.0) * (last_val + 4.0 * mid_val + cur_val);
        sum += cur_int;
        last_val = cur_val;
    }
    sum *= step_size;
    return sum;
}
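A quick usage check (my addition; the integrator is repeated so the snippet stands alone): the integral of sin over [0,pi] is exactly 2, and with 2^16 trapezoid steps we should land within 1e-6 of it.

```cpp
#include <cmath>

template < typename functor >
double Integrate( double lo, double hi, int steps, functor f )
{
    double sum = 0.0;
    double last_val = f(lo);
    double step_size = (hi - lo)/steps;
    for(int i=1;i <= steps;i++)
    {
        double t = ( (steps - i) * lo + i * hi ) / steps;
        double cur_val = f(t);
        sum += (1.0/2.0) * (cur_val + last_val); // trapezoid
        last_val = cur_val;
    }
    return sum * step_size;
}

const double kPi = 3.14159265358979323846;

// integral of sin(x) over [0,pi] == 2 exactly
double SinIntegral()
{
    return Integrate( 0.0, kPi, 1<<16, [](double x) { return std::sin(x); } );
}
```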
BTW I think it might help to use steps = an exact power of two (like 2^16), since that is exactly
representable in floats.
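On the oscillating-cancellation worry above: a simpler standard fix than full cascaded accumulators is Kahan (compensated) summation, which carries the rounding error of each += in a separate correction term. A sketch (my addition, not the original code):

```cpp
#include <cmath>

// Kahan compensated summation of n copies of v.
// "c" holds the low-order bits that a plain += would throw away.
double KahanSumConstant( double v, int n )
{
    double sum = 0.0;
    double c = 0.0;
    for(int i=0;i < n;i++)
    {
        double y = v - c;        // apply the running correction
        double t = sum + y;      // big + small : low bits of y are lost here...
        c = (t - sum) - y;       // ...and recovered here
        sum = t;
    }
    return sum;
}
```

Summing 0.1 ten million times this way stays within ~1e-10 of 1e6, where a plain accumulator drifts much further.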
I also did something that's quite useless but kind of educational. Say you want a log2() and for some reason
you don't want to call log(). (note that this is dumb because most platforms have fast native log or good
libraries, but whatever, let's continue).
Obviously the floating point exponent is close, so if we factor our number :
X = 2^E * (1 + M)
log2(X) = log2( 2^E * (1 + M) )
log2(X) = log2( 2^E ) + log2(1 + M)
log2(X) = E + log2(1 + M)
log2(X) = E + ln(1 + M) / ln(2)
M in [0,1)
Now, obviously ln(1 + M) is the perfect kind of thing to do a series expansion on.
We know M is "small" so the obvious thing that a junior mathematician would do is the Taylor expansion :
ln(1+x) ~= x - x^2/2 + x^3/3 - x^4/4 + ...
that would be very wrong. There are a few reasons why. One is that our "x" (M) is not actually "small". M is equally likely
to be anywhere in [0,1] , and for M -> 1 , the error of this expansion is huge.
More generally, Taylor series are just *NEVER* the right way to do functional approximation. They are very useful mathematical
constructs for doing calculus and limits, but they should only be used for solving equations, not in computer science. Way too
many people use them for function approximation which is NOT what they do.
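A quick sanity check of that claim (my addition): the truncated Taylor series is excellent for tiny x and badly wrong at x = 1, where its error is around 0.09 in absolute terms.

```cpp
#include <cmath>

// 5-term Taylor expansion of ln(1+x), truncated
double LnTaylor5( double x )
{
    return x - x*x/2.0 + x*x*x/3.0 - x*x*x*x/4.0 + x*x*x*x*x/5.0;
}
```

At x = 0.01 the error is around 1e-13; at x = 1 it is about 0.09, roughly 13% of ln(2).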
If for some reason we want to use a Taylor-like expansion for ln() we can fix it. First of all, we can bring our x into range
better.
if ( 1+x > 4.0/3.0 )
{
    E++;
    x = (1+x)/2 - 1;
}
if (1+x) is large, we divide it by two and compensate in the exponent. Now instead of having x in [0,1] we have x in [-1/3,1/3] which is
better.
The next thing you can do is find the optimal last coefficient. That is :
ln(1+x) ~= x - x^2/2 + x^3/3 - x^4/4 + k5 * x^5
for a 5-term polynomial. For x in [-epsilon,epsilon] the optimal value for k5 is 1/5 , the Taylor expansion. But that's not the range of
x we're using. We're using either [-1/3,0] or [0,1/3]. The easiest way to find a better k5 is to take the difference from a higher order
Taylor :
delta = k5 * x^5 - ( x^5/5 - x^6/6 + x^7/7 )
Integrate delta^2 over [0,1/3] to get the squared L2 error, then set the derivative with respect to k5 to zero to minimize it. What you get is :
x in [0,1/3] :
k5 = (1/5 - 11/(18*12))
x in [-1/3,0] :
k5 = (1/5 + 11/(18*12))
It's intuitive what's happening here: if you just truncate a Taylor series at some order, you're doing the wrong thing. The N-term Taylor series is not the best N-term approximation. What we've done here is find the average of what all the future terms add up to and fold that in as a compensation. In particular, in the ln case the terms alternate sign and each one is smaller than the last, so truncation overshoots the true value; we need to compensate down in the x > 0 case and up in the x < 0 case.
using k5 instead of 1/5 we improve the error by over 10X :
basic : 1.61848e-008
improved : 1.31599e-009
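The optimization itself is just a projection of the dropped tail onto x^5. A sketch (my addition; note it keeps only the first dropped term x^6/6, since that is the assumption that reproduces the quoted k5 exactly - projecting more tail terms would shift the value a bit):

```cpp
#include <cmath>

// Least-squares optimal k5 on [0,1/3] :
// minimize integral of ( k5*x^5 - (x^5/5 - x^6/6) )^2
// => k5 = 1/5 - <x^5, x^6/6> / <x^5, x^5>
double OptimalK5()
{
    double a = 1.0/3.0;
    double num = std::pow(a,12.0)/(6.0*12.0);  // integral of x^5 * (x^6/6) over [0,a]
    double den = std::pow(a,11.0)/11.0;        // integral of x^5 * x^5 over [0,a]
    return 1.0/5.0 - num/den;
}
```

The ratio works out to 11/(18*12), matching the k5 above; numerically it's about 0.14907 instead of the Taylor 0.2.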
The full code is :
float log2_improved( float X )
{
    ASSERT( X > 0.0 );

    // approximate log2 by getting the exponent from the float
    // and then using the mantissa to do a Taylor series

    // get the exponent :
    uint32 X_as_int = FLOAT_AS_INT(X);
    int iLogFloor = (X_as_int >> 23) - 127;

    // get the mantissa :
    FloatAnd32 fi;
    fi.i = (X_as_int & ( (1<<23)-1 ) ) | (127<<23);
    double frac = fi.f;
    ASSERT( frac >= 1.0 && frac < 2.0 );

    double k5;
    if ( frac > 4.0/3.0 )
    {
        // (frac/2) is closer to 1.0 than frac is
        // push the iLog up and our correction will now be negative
        // the branch here sucks but this is necessary for accuracy
        // when frac is near 2.0, t is near 1.0 and the Taylor is totally invalid
        frac *= 0.5;
        iLogFloor++;
        k5 = (1/5.0 + 11.0/(18.0*12.0));
    }
    else
    {
        k5 = (1/5.0 - 11.0/(18.0*12.0));
    }

    // X = 2^iLogFloor * frac
    double t = frac - 1.0;
    ASSERT( t >= -(1.0/3.0) && t <= (1.0/3.0) );

    double lnFrac = t - t*t*0.5 + (t*t*t)*( (1.0/3.0) - t*(1.0/4.0) + t*t*k5 );

    float approx = (float)iLogFloor + float(lnFrac) * float(1.0 / LN2);
    return approx;
}
In any case, this is still not right. What we actually want is the best N-term approximation on a certain interval. There's no need to
mess about, because that's a well defined thing to find.
You could go at it brute force, start with an arbitrary N-term polynomial and optimize the coefficients to minimize L2 error. But that
would be silly because this has all been worked out by mathemagicians in the past. The answer is just the
"Shifted Legendre Polynomials" . Legendre Polynomials are defined on [-1,1] ; you can shift them to any range [a,b] that you're
working on. They are an orthogonal transform basis for functions on that interval.
The good thing about Legendre Polynomials is that the best coefficients for an N-term expansion in Legendre
polynomials are just the Hilbert dot products (integrals) with the Legendre basis functions. Also, if you do the infinite-term
expansion in the Legendre basis, then the best expansion in the first N polynomials is just the first N terms
of the infinite term expansion (note that this was NOT true with Taylor series). (the same is true of Fourier or any orthonormal series;
obviously it's not true for Taylor because Taylor series are not orthonormal - that's why I could correct for higher terms by adjusting the
lower terms, because < x^5 * x^7 > is not zero). BTW another way to find the Legendre coefficients is to start with the Taylor series,
and then do a least-squares best fit to compensate each lower coefficient for the ones that you dropped off.
(note the standard definition of Legendre polynomials makes them orthogonal but not orthonormal ; you have to correct for their norm as we
show below :)
To approximate a function f(t) we just have to find the coefficients : C_n = < P_n * f > / < P_n * P_n >.
For our function f = log2( 1 + x) , they are :
0.557304959,
0.492127684,
-0.056146683,
0.007695622,
-0.001130694,
0.000172362
which you could find by exact integration but I got lazy and just did numerical integration. Now our errors are :
basic : 1.62154e-008
improved : 1.31753e-009
legendre : 1.47386e-009
Note that the legendre error reported here is slightly higher than the "improved" error - that's because the Legendre version just uses
the mantissa M directly on [0,1] -
there's no branch test with 4/3 and exponent shift. If I did that for the Legendre version then I should do new fits for the ranges [-1/3,0] and [0,1/3] and
it would be much much more accurate. Instead the Legendre version just does an unconditional 6-term fit and gets almost the same results.
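Here's the lazy numerical-integration route to those coefficients, as a standalone sketch (my addition; it assumes the shifted-Legendre convention where the norm <P_n,P_n> on [0,1] is 1/(2n+1), so C_n = (2n+1) * <P_n, f>):

```cpp
#include <cmath>

// Legendre polynomial P_n(u) on [-1,1] via the standard three-term recurrence.
double LegendreP( int n, double u )
{
    double p0 = 1.0;
    double p1 = u;
    if ( n == 0 ) return p0;
    for(int k=2;k <= n;k++)
    {
        double p2 = ( (2*k - 1) * u * p1 - (k - 1) * p0 ) / k;
        p0 = p1;
        p1 = p2;
    }
    return p1;
}

// C_n = <P_n, f> / <P_n, P_n> for f(x) = log2(1+x) on [0,1] ;
// shift Legendre to [0,1] with u = 2x-1 ; the norm there is 1/(2n+1).
// midpoint rule is plenty for a sanity check.
double Log2LegendreCoefficient( int n, int steps )
{
    double sum = 0.0;
    for(int i=0;i < steps;i++)
    {
        double x = (i + 0.5) / steps;
        sum += LegendreP( n, 2.0*x - 1.0 ) * std::log2( 1.0 + x );
    }
    return (2*n + 1) * sum / steps;
}
```

With a few hundred thousand steps this reproduces the table above: C_0 comes out 0.557305 (which is exactly 2 - 1/ln2) and C_1 comes out 0.492128.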
Anyway, like I said don't actually use this for log2 , but these are good things to remember any time you do functional approximation.