02-19-09 - PNG Sucks

If you're a game developer and you want to ship your game with compressed lossless images (which you should want - it makes the install faster and smaller, and you take less space on disk, and your game loads faster because decompression is faster than disk IO - it's win win win) - PNG is a natural choice. The good thing about PNG is it's widely supported now, and it's relatively simple (though still ungodly complex for what it does). The bad thing is it really sucks for compression in silly ways.

One problem is that the actual LZ is Zip. Zip is really bad old technology. I know they used it cuz its own, blah blah. Anyway. You are probably already compressing your game content with some kind of LZ (you should be). Maybe CAB or LZMA. Both of those knock the living piss out of Zip. But if you Zip the image first (via PNG) then CAB or LZMA can't work on it. Instead, just leave your image alone - leave it in BMP - and let the CAB/LZMA/whatever do its LZ !

The only thing missing is the "DPCM" (silly name for doing delta from neighbors). Again PNG has some pretty awful choices for DPCM filters.

We could make a much better DPCM using a larger neighborhood, but for speed and simplicity we will only consider the 1-ring (like PNG), that is N, W, NW :

W   ?

There are only two logical symmetric values that we can make from our neighborhood :

grad = N + W - NW;
avg  = (N + W)/2

"grad" is the gradient predictor, the simple planar fit. The set of all symmetric linear predictors can thus be specified by one parameter :

pred(t) = avg * t + grad * (1-t)

PNG provides pred(t) for t = 0 and 1, but 0.25 , 0.5 amd 0.75 are all very good values too (and can be implemented with shifts), and in fact they beat 0 and 1 very handily on most images.

Note that "grad" is actually a perfect predictor for horizontal and vertical stripes. That is :

B ?

predicts B

B ?

predicts A

The only time you would actually want a "N" or "W" pure predictor is in a weird case where pixels were correlated in one direction but not the other. For example if you had an image made from interlacing many images such that each horizontal line came from a different source image, then the "W" predictor would win. I have yet to find a single image where pred(t) is not best or very close to best.

Note that you could also obviously do an adaptive DPCM predictor in the style of CALIC that looks at the neighborhood edges and gradients and ranges and chooses different predictors. But that's an awful lot of complexity and would make it hard to put our simple DPCM into MMX or whatever.

So, this is what I'm roughly putting in Oodle for lossless texture compression. I just run a DPCM on the texture samples to turn them into deltas, and then I just jam the texture into my file stream, which LZ compresses everything. It's very easy to do, it's very fast, and it handily beats PNG.

ASIDE : this is only tangentially related, but the whole idea of the "planar" gradient predictor is very useful for predicting and extrapolating images. I used something similar in Galaxy3 to do normal map extrapolation (more about this in the next post). Obviously you can go past 1st-derivative prediction and do 2nd-derivative quadratic gradient prediction. I recently discovered a really sweet paper by Peter Lindstrom on Spectral Predictors which tabulates the best continuous predictor for various support shapes. (in image compression you always have the simple L-shape support because you are scanning in that order, but in other applications like normal map extrapolation you can have any possible support shape).

ADDENDUM : I should note there are like a million ways to beat this. That's not the point. The point is that it's *so* easy to do this, and it's very fast, so why not. Some obvious ways to beat it :

Divide image into 32x32 tiles and pick the best DPCM on each tile. To pick a DPCM do the least-squared best fit to find the optimal linear predictor. Bust least-squares is minimizing MSE which is not what you want, you want the minimum-rate predictor, so use MRP. Use more than a 1-ring neighborhood. etc. etc. (BTW none of this really changes the complexity of the decoder much, but you are definitely getting into the long compression tail of diminishing returns).


Ethatron said...

I recommend reading "Prediction based on backward adaptive recognition of local texture orientation and poisson statistical model for lossless & near-lossless image compression" in this context. It can be reduced to manhattan-distance of 1, and performs very well if not measured against blending or switching predictors, as well being simple. MRP is too slow, even for deployment you don't want to wait 4 hours for a 512x512 map to compress. MRP by the way is only a refactored "Asymetric Lossless Image Compression" by Saywood and Memon, who also had his fingers in CALIC.

cbloom said...

Thanks, somehow I missed both of those papers and they're pretty good.

The "BAROLO" predictor is kind of similar to GLICBAWLS or such, but simpler.

The "ALIC" ideas are pretty standard these days but was very good in 95. The paper is also pretty short and missing crucial details.

As for ALIC being MRP, well "Asymetric Lossless Image Compression" seems to just suggest that using LAD (minimum L1) should give you a minimum-rate-predictor for laplacian sources, the MRP paper actually works out how to compute MRP through weighted L2 minimization, which is a far better defined proposal.

cbloom said...

But anyhoo, the win from going beyond the simple scheme presented here is very small. I'd like to work on lossless image compression some day, I have some ideas eating at my brain wanting to get out, but it's just so pointless.

The only practical improvement would be to take 32x32 blocks and pick the best predictor on each one. Also you may as well use the 2-ring neighbors because that doesn't really slow you down much.

Ethatron said...

There is a follow-up paper to the "ALIC", I just can't find it, it's a bad mess with all these PDFs around and you only have this stupid index-thingy around. I wish there would be a plug-in allowing nice management of journal-papers.

There is some work based on the MRP which may interest you too: "A High Performance Lossless Image Coder".

> The "BAROLO" predictor is kind of similar to GLICBAWLS or such, but simpler.

Yes I agree, the BAROLTO is some kind of super-stripped-down 2-point LS, with just all parameters choosen empirically (which context, how much of it, and how).

I checked it out in the framework and he indeed is almost a drop-in replacement of the 1&2-point interpolators.

> but it's just so pointless

:) If you're not the Guiness-Book type, it's indeed pointless to hunt for the most best superior abracadabra compressor. But you were, weren't you? ;)

But I personally still see a lot of satisfaction in making really weird cool image compressors like this one: "A Successively Refinable Lossless Image-Coding Algorithm", also with fingers of Memon. When I read that paper I just thought "Karamba!". You know I have my teeth deep in progressive image compression.

And or just making a much affordable image compressor. I just extended PW to 14bit and dropped in YCbCr RCT (8+9+9). The result is that I now start to kick RKIM from Malcom, which was allways unbeaten. In a lot of cases PW drops size as much as 20% under JPEG2000 lossless (think progressive!).

cbloom said...

Wow the compression.ru guys have an excellent compilation of lossless image papers :


old rants