Comments on cbloom rants: 06-17-09 - Inverse Box Sampling - Part 2


2012-11-01T10:51:16.376-07:00

This is what's in my old code, but I don't vouch for the 1 term solution. For some reason I recall thinking that 2 terms produced much better results and was the minimum, but I don't recall exactly why I thought that.

(by "term" I mean the number of free variables, so "1 term" is actually a 4-tap discrete filter; symmetry and sum determine the remaining taps)

#if DOWNSAMPLE_TERMS == 3

//static double c_downCoef[DOWNSAMPLE_TERMS+1] = { 1.26746, 0.08675, -0.372, 0.01779 };
static double c_downCoef[DOWNSAMPLE_TERMS+1] = { 1.31076, 0.02601875, -0.4001217, 0.06334295 };

#elif DOWNSAMPLE_TERMS == 2

static double c_downCoef[DOWNSAMPLE_TERMS+1] = { 1.25 , 0.125, - 0.375 };

#elif DOWNSAMPLE_TERMS == 1

static double c_downCoef[DOWNSAMPLE_TERMS+1] = { 0.95, 0.05 };

#else

static double c_downCoef[DOWNSAMPLE_TERMS+1] = { 1.0 };

#endif
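As a reading aid, here is a minimal sketch of how a coefficient table like the ones above could be expanded into a full symmetric filter and applied for a 2:1 downsample in 1D. It assumes (the comment does not spell this out) that c_downCoef[0] is the tap nearest the center, that the taps mirror around the midpoint between the two center source pixels, and that the whole filter is scaled by 1/2 so it sums to 1. The names expand_taps and downsample_row are hypothetical, and edges are simply clamped.

```c
/* Expand terms+1 half-filter coefficients into 2*(terms+1) symmetric taps.
   Assumed layout: coef[0] is nearest the center and the coefficients sum to 1,
   so scaling by 0.5 makes the full filter sum to 1. Returns the tap count. */
static int expand_taps(const double *coef, int terms, double *taps)
{
    int half = terms + 1;
    for (int i = 0; i < half; i++) {
        taps[half - 1 - i] = 0.5 * coef[i];   /* left side, mirrored */
        taps[half + i]     = 0.5 * coef[i];   /* right side */
    }
    return 2 * half;
}

/* 2:1 downsample of one row; source indices are clamped at the edges.
   Output pixel o is centered between source pixels 2*o and 2*o+1. */
static void downsample_row(const double *src, int srcLen,
                           const double *taps, int numTaps, double *dst)
{
    int half = numTaps / 2;
    for (int o = 0; o < srcLen / 2; o++) {
        double sum = 0.0;
        for (int t = 0; t < numTaps; t++) {
            int s = 2 * o + 1 - half + t;     /* first tap sits at 2*o - (half-1) */
            if (s < 0) s = 0;
            if (s > srcLen - 1) s = srcLen - 1;
            sum += taps[t] * src[s];
        }
        dst[o] = sum;
    }
}
```

Under that assumed layout, the DOWNSAMPLE_TERMS == 2 table gives the 6 taps { -0.1875, 0.0625, 0.625, 0.625, 0.0625, -0.1875 }, and the 1-term table gives the 4 taps { 0.025, 0.475, 0.475, 0.025 }.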


2012-10-31T14:06:37.715-07:00

Thanks for the advice and comments.
Ryg's articles are very interesting to read, although some of the tricks explained apply only to the "blur" filter he presents (especially the "filter in constant time") and seem unlikely to benefit inverse box sampling. They are nonetheless very instructive.

By the way, are the static coefficients for n=1 {5/4, -1/4}?


2012-10-31T12:48:08.040-07:00

The way to do 2d filtering is always to do 1d filtering twice. Ryg's blog has some good stuff on fast filtering.

You want to be doing MMX loads of 8 pixels at once, then you're just sliding a window along the row doing muls and accumulates.
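A scalar sketch of the "1d filtering twice" idea, for illustration only: rows are filtered first into a temporary half-width image, then columns. The function names down_line and down_2d are hypothetical, the image is assumed to be a single float channel with even width and height, and edges are clamped. The SIMD version described above would vectorize the inner multiply-accumulate loop; that is not shown here.

```c
#include <stdlib.h>

/* One 2:1 filtering pass along a strided line of srcLen samples (edges clamped). */
static void down_line(const float *src, int srcLen, int srcStride,
                      const float *taps, int numTaps,
                      float *dst, int dstStride)
{
    int half = numTaps / 2;
    for (int o = 0; o < srcLen / 2; o++) {
        float sum = 0.0f;
        for (int t = 0; t < numTaps; t++) {
            int s = 2 * o + 1 - half + t;
            if (s < 0) s = 0;
            if (s > srcLen - 1) s = srcLen - 1;
            sum += taps[t] * src[s * srcStride];
        }
        dst[o * dstStride] = sum;
    }
}

/* 2D 2:1 downsample as rows first, then columns. w and h are assumed even.
   Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
static int down_2d(const float *src, int w, int h,
                   const float *taps, int numTaps, float *dst)
{
    int hw = w / 2;
    float *tmp = malloc((size_t)hw * (size_t)h * sizeof *tmp);
    if (!tmp) return -1;
    for (int y = 0; y < h; y++)      /* horizontal pass: w x h -> hw x h */
        down_line(src + (size_t)y * w, w, 1, taps, numTaps,
                  tmp + (size_t)y * hw, 1);
    for (int x = 0; x < hw; x++)     /* vertical pass: hw x h -> hw x h/2 */
        down_line(tmp + x, h, hw, taps, numTaps, dst + x, hw);
    free(tmp);
    return 0;
}
```

The same taps array is used for both passes, which is what makes the 2D filter the outer product of the 1D filter with itself.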


2012-10-31T12:07:57.444-07:00

I'm more on the "speed" side than the quality side, although it's always possible to borrow cheap quality improvements, as long as real-time performance is not jeopardized.

So at first sight, I tend to agree with your suggestion.

Now, I understand your formula in the context of 1D, but it's less clear to me how it translates into 2D.

If I understand correctly, for n=2 it will be necessary to sample a full 6x6 block to calculate the value of the downsampled inner 2x2 block.
That means 36 inputs per output pixel instead of 4 (for the basic case). At the very least, it means 9x slower, probably more since the coefficients are not as simple.

That's why I'm also interested in the simpler n=1, with just 2 coefficients. It would translate into "only" 16 inputs, still 4x slower, but a kind of middle ground compared to n=2. So if n=2 proves too slow, maybe n=1 will be good enough.
However, the "static" coefficients for n=1 are not given in your article, if I'm not mistaken.
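The input counts above can be checked with a little arithmetic. The sketch below counts only multiply-accumulates per output pixel, assuming 2*(n+1) taps per axis and a 2:1 reduction on both axes; the function names are hypothetical. It also shows the count for filtering rows and then columns, which is the alternative to one large 2D filter: each output pixel then needs two intermediate row results plus one vertical filter.

```c
/* Taps per axis for n free terms, as counted above: n=1 -> 4, n=2 -> 6. */
static int taps_per_axis(int n) { return 2 * (n + 1); }

/* Multiply-accumulates per output pixel for a direct (non-separable) 2D filter. */
static int macs_direct(int n)
{
    int t = taps_per_axis(n);
    return t * t;
}

/* Same count for rows-then-columns filtering with 2:1 reduction on both axes:
   two intermediate row results (t MACs each) plus one vertical filter (t MACs). */
static int macs_separable(int n)
{
    int t = taps_per_axis(n);
    return 3 * t;
}
```

This gives 36 versus 18 for n=2 and 16 versus 12 for n=1. It ignores memory traffic and the intermediate buffer, so it is only a rough guide to relative speed.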


2012-10-31T11:44:11.838-07:00

Well, the simple { 1.25 , 0.125, - 0.375 } filter certainly can be done in real time, and is a decent win. An SSE/MMX implementation should be super fast. It depends exactly where you want to be on the quality/speed tradeoff.


2012-10-31T09:41:04.526-07:00

I guess this method is applicable to the "downsampled Co/Cg channels" of the Humus texture format, since they are intended to be stretched 2x using a bilinear filter.

I wonder if it is suitable for real-time downsizing though.
Simply averaging the 2x2 box is a trivial task. Your results with n=1 seem to show there is a significant quality gain to be had, but is it achievable in real time?
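For reference, the 2x bilinear stretch mentioned above can be sketched in 1D as follows. The sample placement is an assumption: each low-res sample is taken to sit midway between the two high-res pixels it covers, which gives the familiar 3/4 and 1/4 weights. The name bilinear_up2_row is hypothetical and edges are clamped.

```c
/* 2x bilinear upsample of one row. Low-res sample i is assumed to sit midway
   between high-res pixels 2*i and 2*i+1, so each output blends its own
   low-res sample (weight 3/4) with the nearer neighbor (weight 1/4). */
static void bilinear_up2_row(const float *low, int lowLen, float *high)
{
    for (int i = 0; i < lowLen; i++) {
        float prev = low[i > 0 ? i - 1 : 0];                    /* clamp left */
        float next = low[i < lowLen - 1 ? i + 1 : lowLen - 1];  /* clamp right */
        high[2 * i]     = 0.75f * low[i] + 0.25f * prev;
        high[2 * i + 1] = 0.75f * low[i] + 0.25f * next;
    }
}
```

A 2D stretch would apply the same pass along rows and then along columns.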


2009-06-22T09:17:50.194-07:00

Comparison to standard filters:

r:\>bmputil mse lenag.256.bmp bilinear_down_up_0.bmp rmse : 15.5437 psnr : 24.3339

r:\>bmputil mse lenag.256.bmp bilinear_down_up_1.bmp rmse : 13.5138 psnr : 25.5494

r:\>bmputil mse lenag.256.bmp bilinear_down_up_2.bmp rmse : 13.2124 psnr : 25.7454

r:\>bmputil mse lenag.256.bmp bilinear_down_up_3.bmp rmse : 13.0839 psnr : 25.8302

BoxFilter
filter 2 : lenag.256.bmp : // rgb rmse : 15.544 , gray rmse : 15.544

CubicFilter
filter 3 : lenag.256.bmp : // rgb rmse : 15.919 , gray rmse : 15.919

MitchellFilter B = 1/3
filter 4 : lenag.256.bmp : // rgb rmse : 14.956 , gray rmse : 14.956

MitchellFilter B = 0
filter 5 : lenag.256.bmp : // rgb rmse : 14.527 , gray rmse : 14.527

GaussianFilter
filter 6 : lenag.256.bmp : // rgb rmse : 15.656 , gray rmse : 15.656

SincFilter
filter 7 : lenag.256.bmp : // rgb rmse : 14.623 , gray rmse : 14.623


2009-06-19T19:04:36.303-07:00

It's probably worth comparing your result to what you would get by just applying any of the standard mipmapping filters without paying any attention to the bilerp reconstruction, to have some baseline.

I never published anything since I wasn't actually able to get any interesting results.

I think the interesting aspect of the full solve is the opportunity to introduce non-linearities; if you're doing linear programming to solve the linear least-squares problem, you can introduce additional linear constraints, e.g. you can attempt to limit a given output value to a particular range, say the max range of N of the neighboring source pixels, to try to limit ringing, or similarly you could apply an output constraint x_n0 < x_n1 if your analysis indicates that the pixels in that area should be monotonically increasing.

I'm not sure any of this will improve PSNR--ringing tends to occur in the solved output because by displacing a value "too far" the midpoint value tends to come out better--so it's more likely in the service of perceptual quality.
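To make the range constraint concrete, here is a much cruder stand-in: instead of adding constraints to the solve, it clamps each already-downsampled value to the min and max of the nearby source pixels as a post-process. This is not the constrained solve described above, only an illustration of the kind of limit it would impose. The name clamp_to_source_range and the radius parameter are hypothetical; radius is assumed to be at least 1.

```c
/* Clamp each 2:1-downsampled value to the [min, max] of the 2*radius source
   pixels around it. This is a post-process, not a constrained solve.
   radius must be >= 1; radius 1 uses just source pixels 2*o and 2*o+1. */
static void clamp_to_source_range(const float *src, int srcLen,
                                  float *down, int radius)
{
    for (int o = 0; o < srcLen / 2; o++) {
        int lo = 2 * o + 1 - radius;
        int hi = 2 * o + radius;
        if (lo < 0) lo = 0;
        if (hi > srcLen - 1) hi = srcLen - 1;
        float mn = src[lo], mx = src[lo];
        for (int s = lo + 1; s <= hi; s++) {
            if (src[s] < mn) mn = src[s];
            if (src[s] > mx) mx = src[s];
        }
        if (down[o] < mn) down[o] = mn;   /* undershoot */
        if (down[o] > mx) down[o] = mx;   /* overshoot */
    }
}
```

As the comment notes, limiting overshoot this way is more likely to help perceptual quality than PSNR.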