06-21-10 - RRZ PNG-alike notes

Okay I've futzed around with heuristics today and it's a bit of a dead end. There are just a few cases that you cannot ever catch with a heuristic.

There is one good heuristic that works 90% of the time :

do normal filter
try compress with LZ fast and MinMatchLen = 8
try compress with LZ normal and MinMatchLen = 3

if the {fast,8} wins then the image is almost certainly a "natural" image. Natural images do very well with long min match len, smooth filters, and simple LZ matchers.

If not, then it's some kind of weird synethetic thing. At that point, pretty much all bets are off. Synthetic images have the damnable property that certain patterns repeat, so they are very sensitive to whether the LZ can find those patterns after filtering and such. But, a good start is to try the no-filter case with normal LZ, and perhaps try the Adaptive, and you can use Loco or Non-Loco depending on whether the normal filter chose loco post-filter or not.

But there are some bitches. kodak_12 is a natural image, and we detect that right. The problem is the best mode {N,N+L,A,A+L} changes when you optimize the LZ parameters, and it changes by quite a lot actually. Many modes will show N+L or A+L willing by quite a lot, but the right mode is N and it wins by 10k.

ryg_t.train.03.bmp is the worst. It is a solid 10% better with the "Normal" mode, but this only shows up when you do the LZ optimal parse; at any other setting of LZ all the modes are very close, but for some reason there are some magic patterns that only occur in Normal mode which are very good for the optimal parse - all the other modes stay about the same size when you turn LZ up to optimal, but Normal filter gets way smaller.

Okay, some actually useful notes :

There are some notes on the PNG web pages that say the best way to choose the filter per row is with sum of abs. Oh yeah? I can beat it. First of all, doing sum of abs but adding a small penalty for non-zero helps a tiny bit. But the best thing is to do entropy per row, and add a penalty for non-zero. You're welcome.

The filters N and (N+W)/2 are almost never best as whole-image filters, but are actually helpful in the adaptive filter loop.

I reduced my filter set down to only 5 and it hurt very little. Having the extra filters is basically free in terms of the format, but is a pain in the ass to maintain if you need to write optimized SIMD decoders for every filter on every platform. So for my own sanity, a minimum set of filters is preferrable.

BTW I should note that the fact that you have to tune minMatchLen and lzLevel is an indicator of the limitations of the optimal parse. If the optimal parse really found the best LZ stream, you should just run Optimal and let it pick what matches it wants. This is an example of it finding a local minimum which is not the global minimum. The problem is that the Huffman codes are severely different if you run with MML = 3 or 5 for example. Maybe there's a way around this problem; it requires more thought.

No comments:

old rants