02-10-09 - Image Compression Blues

Good lord the world of image compression is so screwed up. There's no standard image test set (the old Kodak images are archaic, and even then people use different variants of Lena; this new test image set is pretty good but it's not standard so fuck), there's no standard error measure (people even compute RMSE and PSNR differently), and really we should be using a perceptual measure, but again there's no good standard perceptual measure. It is nice to see that things like SSIM are catching on. However, SSIM is a bit vague in its specification of blocking, and there are various implementations that do it differently, so we're back in the fucked up apples-to-oranges comparison methodology.
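To make the RMSE/PSNR complaint concrete, here is a minimal sketch of two ways people plausibly compute "the PSNR" of a multi-channel image. The function names and the 8-bit peak of 255 are my assumptions for illustration, not anybody's standard. One version pools the squared error over every sample before taking the log, the other averages per-channel PSNRs, and they give different numbers whenever the channels have different errors.

```python
import math

def mse(a, b):
    """Mean squared error over two equal-length sequences of sample values."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def rmse(a, b):
    """Root mean squared error."""
    return math.sqrt(mse(a, b))

def psnr(a, b, peak=255.0):
    """PSNR in dB; peak=255 assumes 8-bit samples."""
    m = mse(a, b)
    if m == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(peak * peak / m)

def psnr_pooled(orig_channels, dec_channels, peak=255.0):
    """Convention A: one MSE pooled over every sample of every channel."""
    flat_o = [v for ch in orig_channels for v in ch]
    flat_d = [v for ch in dec_channels for v in ch]
    return psnr(flat_o, flat_d, peak)

def psnr_channel_mean(orig_channels, dec_channels, peak=255.0):
    """Convention B: arithmetic mean of the per-channel PSNRs."""
    vals = [psnr(o, d, peak) for o, d in zip(orig_channels, dec_channels)]
    return sum(vals) / len(vals)

if __name__ == "__main__":
    # Two tiny made-up "channels" with different amounts of error.
    orig = [[10, 20], [10, 20]]
    dec = [[11, 21], [13, 23]]
    print("pooled      :", psnr_pooled(orig, dec))
    print("channel mean:", psnr_channel_mean(orig, dec))
```

Neither convention is wrong, which is exactly the problem: unless a paper says which one it used (and what peak value, and whether it measured in RGB or luma), the dB numbers aren't comparable.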

Making it all worse is that people keep making new shitty standards and claiming they "look good to their eyes". I mentioned before that the HD Photo guys were saying some sort of silly things about error. Well guess what, it sucks. I just found this nice benchmark with graphs that shows HD Photo doing much worse than even old baseline JPEG (!!) under the SSIM metric. WTF, how can you do worse than JPEG !? Bush league IMO.

I also wasted some time this morning looking at PGF (libPGF) . PGF is a semi-open wavelet library. It does have some good properties. The code is actually very simple and semi-readable. Compression performance is a bit better than baseline JPEG. On the minus side, compression performance is not anywhere close to state of the art. Even my very simple/fast "cbwave" beats it handily.

BTW looking at the PGF source code this is what it seems to do :

It uses a very simple small integer lifting wavelet transform. It's a 5/3 tap transform, perhaps the Le Gall transform that is also used in JPEG2000 ? It does not do anything smart about memory flow for the subbands; it's basically a bad brute-force implementation, which means "cbwave" can beat it easily for speed.
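For reference, this is roughly what one level of an integer 5/3 lifting transform looks like in 1D. It's a sketch of the reversible 5/3 (Le Gall) lifting steps as I understand them from JPEG2000, not code lifted from PGF; the function names, the even-length restriction and the mirrored edge handling are my own assumptions.

```python
def forward_53(x):
    """One level of reversible 5/3 lifting on an even-length list of ints.

    Returns (low, high): the low-pass and high-pass halves.
    """
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("expects an even length of at least 2")
    half = n // 2
    even = x[0::2]
    odd = x[1::2]
    # Predict: each odd sample minus the floor-average of its two even
    # neighbours. At the right edge the missing neighbour is mirrored.
    high = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
            for i in range(half)]
    # Update: each even sample plus a rounded quarter of the two
    # neighbouring details. At the left edge the missing detail is mirrored.
    low = [even[i] + ((high[max(i - 1, 0)] + high[i] + 2) >> 2)
           for i in range(half)]
    return low, high

def inverse_53(low, high):
    """Undo forward_53 exactly by running the lifting steps in reverse."""
    half = len(low)
    even = [low[i] - ((high[max(i - 1, 0)] + high[i] + 2) >> 2)
            for i in range(half)]
    odd = [high[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2] = even
    x[1::2] = odd
    return x

if __name__ == "__main__":
    signal = [12, 15, 11, -4, 0, 7, 30, 29]
    low, high = forward_53(signal)
    assert inverse_53(low, high) == signal  # integer lifting is lossless
    print(low, high)
```

The nice property of lifting is that the inverse is just the same steps with the signs flipped, so the integer rounding can't break reversibility. A 2D image transform would run this over rows and then columns, which is where the memory flow starts to matter.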

The coder is kind of interesting. It's a bitplane based coder with no entropy coder. It just uses a certain kind of bit-packing that makes small streams when there are lots of zeros; it's similar to the old EZW or SPIHT type of zero-tree coding, just with bit sequences. Obviously these codecs have implicit modeling and "entropy coding" built in to the bit sequence spec, they just avoid the arithmetic coder. The method in PGF works by breaking the subbands into blocks, choosing a linear walk order on the block, and then doing linear RLE on the bitplanes. The significance bitstreams are basically a few scattered ones with big blocks of zeros, and the RLE is just coding that out.
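Here's a toy version of the idea, just to show why RLE on bitplanes works at all. This is not PGF's actual bitstream format: the function names are made up, the coefficients are assumed to already be in walk order, and I'm coding the raw magnitude bits of each plane rather than splitting significance and refinement bits the way a real coder would.

```python
def bitplane(coeffs, plane):
    """Bit number `plane` of each coefficient magnitude, in walk order."""
    return [(abs(c) >> plane) & 1 for c in coeffs]

def rle_encode(bits):
    """Return (runs, tail).

    runs holds the number of zeros preceding each 1 bit;
    tail is the count of zeros after the last 1.
    """
    runs, run = [], 0
    for b in bits:
        if b:
            runs.append(run)
            run = 0
        else:
            run += 1
    return runs, run

def rle_decode(runs, tail):
    """Rebuild the bit list from rle_encode's output."""
    bits = []
    for r in runs:
        bits.extend([0] * r)
        bits.append(1)
    bits.extend([0] * tail)
    return bits

if __name__ == "__main__":
    # A made-up block of wavelet coefficients: mostly zeros, a few small values.
    block = [0, 0, 9, 0, 0, 0, -1, 0]
    for plane in (3, 2, 1, 0):
        bits = bitplane(block, plane)
        runs, tail = rle_encode(bits)
        assert rle_decode(runs, tail) == bits
        print(plane, bits, runs, tail)
```

On the high planes almost everything is zero, so each plane collapses to a couple of run lengths. A real format still has to pack those run lengths into bits somehow (and send signs), and that packing is where the implicit "entropy coding" lives.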

Another thing I found that I wasn't aware of is the new T.851 variant of JPEG1 ; basically it's a new arithmetic coder called Q15 stuck on the back end of JPEG instead of Huffman or the old QM coder. The IJG is pushing for this, I don't really know what the status is. The performance should be fine. At low bit rates a deblocking filter helps a lot and could make this a decent choice.


nothings said...

The thing about T.851, if you look at their results, is it shows only insignificant improvement over the QM coder (in fact, it's worse on some images). Unless the patent status is better, it hardly seems worth it.

cbloom said...

Yeah, that is the point, Q15 is supposed to be "open".

cbloom said...

To be more informative - Q15 is patented by IBM, but they are supposedly "playing nice" this time and have made it free for use in JPEG-T851 , but not for other uses.

Regardless of how much you believe that or not, the whole point of the new standard is to get away from the QM-coder patents, not because it's so much better than the old jpeg-ari.

Also I should say if you just want to make a JPEG-ari yourself and don't care about it being standard, it's quite trivial to stick on any of the many free arithcoders on the back end of JPEG.

nothings said...

The JPEG group formed in 1986 and the standard came out in 1992, so I would think any relevant patents would expire between 2006 and 2012. That seems a pretty short period to try to get a new standard in for.
