12/16/2006

12-16-06 - 3

How to combine two (continuous) expert predictions :

Say you have two experts. They make predictions which have a Gaussian error. Each expert provides both the prediction (the center of the Gaussian) and his best estimate of the accuracy of that prediction (the sdev of the Gaussian, which is the sqrt of the variance); call these P & S, so you have {P1,S1} and {P2,S2}.

We're going to make a combined prediction by guessing a weight for each expert (here W1 and W2). The combined prediction is then :

Pred = (W1 * P1 + W2 * P2) / (W1 + W2)

(or) Pred = P1 + (W2 / (W1+W2) ) * (P2 - P1)
Pred = P1 + F * (P2 - P1)
F = 1 / (1 + W1/W2)

This latter form is more useful in practice because it lets you apply arbitrary scaling to each term, so you can run them through an lsqr or nnet or whatever. (assuming W1 and W2 never go to zero; if one does, just choose the other expert's pred)
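To make that concrete, here's a minimal sketch in C of the blend in the F form (the function names are mine, not from any library) :

#include <stdio.h>

/* blend two predictions ; F in [0,1] is the fractional weight on p2 ;
   F = 0 returns p1, F = 1 returns p2 */
double blend_preds(double p1, double p2, double F)
{
    return p1 + F * (p2 - p1);
}

/* F from raw nonnegative expert weights ; if one weight is zero we
   just take the other pred, as noted above */
double fraction_from_weights(double w1, double w2)
{
    if ( w2 <= 0.0 ) return 0.0; /* all weight on expert 1 */
    if ( w1 <= 0.0 ) return 1.0; /* all weight on expert 2 */
    return 1.0 / (1.0 + w1 / w2);
}

int main(void)
{
    double F = fraction_from_weights(3.0, 1.0); /* F = 1/(1+3) = 0.25 */
    printf("pred = %f\n", blend_preds(10.0, 14.0, F)); /* 10 + 0.25*4 = 11 */
    return 0;
}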

The ideal combination of experts is in general a context-dependent problem, but a generally good way to weight them is via the entropy of the prediction. If you knew the instantaneous entropy of each prediction, H, the ideal weight would be :

W = e^( - Beta * H )

for some Beta which in general depends on the problem. In practice, Beta = 2 is usually very good.
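In code that's a one-liner ; a sketch, with Beta left as a tuning parameter you'd fit per problem :

#include <math.h>

/* weight an expert by the instantaneous entropy H of its prediction ;
   lower entropy (more certain) -> bigger weight */
double weight_from_entropy(double H, double beta)
{
    return exp( - beta * H );
}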

The entropy of a Gaussian can be analytically integrated. The answer is :

H(Gauss) = (1/2) * ln(2*pi*e) + ln(S)

Where S = the sdev (sigma), recall. Since we assumed our predictors were Gaussian we can use this. But we can simplify :

W = e^( - Beta * H ) = C * e^( - Beta * ln(S) ) = C * ( S^-Beta )

Where we put a bunch of constants in C, and they don't matter because it's an overall scaling of W which divides out.

F = 1 / (1 + W1/W2)
F = 1 / ( 1 + S1^-Beta / S2^-Beta )
F = 1 / ( 1 + (S2/S1)^Beta )
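As a sketch in C, here's the general-Beta version straight from the sdevs (assumes S1, S2 > 0) :

#include <math.h>

/* fraction of the way from P1 to P2 ; Beta = 2 is the usual choice */
double fraction_from_sdevs(double s1, double s2, double beta)
{
    return 1.0 / ( 1.0 + pow( s2 / s1, beta ) );
}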

If we use V = S^2 (V is the "variance" of the Gaussian, or the MSE), and choose the common case of Beta=2, then we have a very simple expression :

F = 1 / ( 1 + V2/V1 )

or F = V1 / (V1 + V2)

and this is the same as choosing W1 = V2 and W2 = V1, ie. (after dividing out V1*V2) weighting each expert by the inverse of its own variance.

Which is in fact a very good way to combine Gaussian predictions.
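Putting it all together, a tiny self-contained C example of the Beta = 2 case (inverse-variance weighting) ; the numbers are made up for illustration :

#include <stdio.h>

/* combine two Gaussian predictions {P,V} by inverse-variance weighting ;
   assumes v1, v2 > 0 */
double combine_gaussians(double p1, double v1, double p2, double v2)
{
    double F = v1 / (v1 + v2); /* = W2/(W1+W2) with W1 = V2, W2 = V1 */
    return p1 + F * (p2 - p1);
}

int main(void)
{
    /* expert 1 : pred 10.0, sdev 1.0 -> V = 1.0
       expert 2 : pred 14.0, sdev 2.0 -> V = 4.0 */
    double pred = combine_gaussians(10.0, 1.0, 14.0, 4.0);
    printf("combined pred = %f\n", pred); /* 10 + (1/5)*4 = 10.8 */
    return 0;
}

The more accurate expert (smaller V) pulls the combined pred toward itself, which is exactly what you want.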
