The predictors all work on a ring; that is, they wrap around [0,uint_max], so you need to use the right uint size for your pixel type. To make this work I just took my 8-bit code and made it a template, and now I work on 8, 16, and 32 bit pixels.
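To make the ring arithmetic concrete, here is a minimal sketch of a templated predictor and residual. It assumes that ClampedGrad (the filter named later in this post) is the gradient N + W - NW clamped to the range of N and W; the function names are made up for illustration and are not RRZ's actual code.

```cpp
#include <algorithm>
#include <cstdint>
#include <type_traits>

// Assumed definition of ClampedGrad: gradient N + W - NW,
// clamped to [min(N,W), max(N,W)]. Names are hypothetical.
template <typename T>
T clamped_grad(T n, T w, T nw)
{
    static_assert(std::is_unsigned<T>::value, "pixel type must be unsigned");
    // Compute the gradient in a wider signed type so it cannot overflow.
    const long long grad = (long long)n + (long long)w - (long long)nw;
    const long long lo = std::min(n, w);
    const long long hi = std::max(n, w);
    return static_cast<T>(std::min(std::max(grad, lo), hi));
}

// Residual on the ring: the subtraction wraps modulo 2^bits of T.
template <typename T>
T to_residual(T pixel, T pred)
{
    return static_cast<T>(pixel - pred);
}

// Inverse: the addition wraps the same way, so the round trip is lossless.
template <typename T>
T from_residual(T residual, T pred)
{
    return static_cast<T>(residual + pred);
}
```

The wrap only comes out right if T matches the pixel width: a 16-bit pixel of 0 predicted as 65535 gives a residual of 1, but the same subtraction done in a 32-bit type would give a huge residual instead.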
RRZ without any changes does pretty well on 16 bit data :
Original : (3735x2230x4x2) 66,632,400
Zip : 38,126,464
PNG : (*1) 26,002,734
JPEG-2000 : 22,404,146
JPEG-XR : 19,783,184
RRZ default : (-m5 -z3 -fa -l0) (*2) 24,169,080
My filter 4 + Zip : (*3) 21,880,451
RRZ with zip-like options : (-m3 -z4 -f4 -l0) 20,907,541
RRZ optimized : (-m3 -z5 -f4 -l1) 17,626,222
My filter 4 + LZMA : 16,011,226

*1 : I ran pngcrush but couldn't run advpng or pngout because they fail on 16 bit data.
*2 : min match len of 5 is the default (-m5) because I found in previous testing that this was best most often. In this case, -m3 is much better. My auto-optimizer finds -m3 successfully. Also note that seekChunkReset is *off* for all these RRZ's.
*3 : filter 4 = ClampedGrad, which is best here; default RRZ filter is "adaptive" because that protects against being way off the best choice, but it is usually slightly worse than whatever the best is. Even when adaptive actually minimizes the L2 norm of prediction residuals, it usually has worse compression (than a uniform single filter) after LZH because it ruins repeated patterns, since it is choosing different predictors on different scan lines.
Note that I didn't do anything special in the back-end for the 16 bit data; the LZH still just works on bytes, which means for example that the Huffman gets rather confused. The most minimal change you could do to make it better would be to make your LZ matches always be even lengths (so you don't send the bottom bit of match len), and to use two Huffmans for literals: one for odd positions and one for even positions. LZMA for example uses 2 bits of position as context for its literal coding, so it knows what byte position you are in. Actually it's surprising to me how close RRZ (single Huffman, small window) gets to LZMA (arithmetic, position context, large window) in this case. It's possible that some transpose might help compression, like doing all the MSB's first, then all the LSB's, but maybe not.
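For reference, the MSB/LSB transpose mentioned above could look something like this. It is only a sketch of the idea (I haven't measured whether it helps), and the function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Split 16-bit values into two byte planes: all MSBs first, then all LSBs.
std::vector<std::uint8_t> split_byte_planes(const std::vector<std::uint16_t>& pixels)
{
    const std::size_t n = pixels.size();
    std::vector<std::uint8_t> planes(n * 2);
    for (std::size_t i = 0; i < n; ++i)
    {
        planes[i]     = static_cast<std::uint8_t>(pixels[i] >> 8);   // MSB plane
        planes[n + i] = static_cast<std::uint8_t>(pixels[i] & 0xFF); // LSB plane
    }
    return planes;
}

// Inverse of split_byte_planes: interleave the two planes back into 16-bit values.
std::vector<std::uint16_t> merge_byte_planes(const std::vector<std::uint8_t>& planes)
{
    const std::size_t n = planes.size() / 2;
    std::vector<std::uint16_t> pixels(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        pixels[i] = static_cast<std::uint16_t>((planes[i] << 8) | planes[n + i]);
    }
    return pixels;
}
```

The byte-oriented LZH then sees a run of (hopefully smooth) high bytes followed by a run of noisy low bytes, instead of the two interleaved.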
ADDENDUM : another thing that would probably help is to turn the residual into a variable-byte code. If the prediction residual is in [-127,127] send it in one byte, else send 0xFF and send a two byte delta. This has the disadvantage of de-aligning pixels (eg. they aren't all 6 or 8 bytes now) but for small window LZ it means you get to fit a lot more data in the window. That is, the window is a much larger percentage of the uncompressed file size, which is good.
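A rough sketch of that variable-byte code for 16-bit residuals follows. One detail is my own assumption: small residuals are stored biased by 127, so they occupy bytes 0..254 and never collide with the 0xFF escape. The two-byte form is written little-endian, which is also just a choice for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append one 16-bit ring residual to 'out' using 1 or 3 bytes.
// Assumption: small residuals are biased by 127 so 0xFF stays free as the escape.
void encode_residual(std::uint16_t r, std::vector<std::uint8_t>& out)
{
    // Signed view of the ring residual, in [-32768, 32767].
    const int s = (r < 0x8000) ? static_cast<int>(r) : static_cast<int>(r) - 0x10000;
    if (s >= -127 && s <= 127)
    {
        out.push_back(static_cast<std::uint8_t>(s + 127)); // 0..254
    }
    else
    {
        out.push_back(0xFF);                                 // escape
        out.push_back(static_cast<std::uint8_t>(r & 0xFF)); // low byte
        out.push_back(static_cast<std::uint8_t>(r >> 8));   // high byte
    }
}

// Read one residual starting at in[pos]; advances pos past the bytes consumed.
std::uint16_t decode_residual(const std::vector<std::uint8_t>& in, std::size_t& pos)
{
    const std::uint8_t b = in[pos++];
    if (b != 0xFF)
    {
        return static_cast<std::uint16_t>(static_cast<int>(b) - 127);
    }
    const unsigned lo = in[pos++];
    const unsigned hi = in[pos++];
    return static_cast<std::uint16_t>(lo | (hi << 8));
}
```

Escaped residuals cost three bytes instead of two, so this only wins when most residuals are small.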
As part of this I got 16-bit PNG reading & writing working, which was pretty trivial. You have to swap your endian on Intel machines. It seems to be a decent format for interchanging 16 bit data, in the sense that Photoshop works with it and it's easy to do with libPNG.
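The endian swap is needed because PNG stores 16-bit samples most significant byte first. A minimal sketch of swapping a row by hand is below (helper names are hypothetical); libpng can also do this for you via png_set_swap.

```cpp
#include <cstddef>
#include <cstdint>

// Swap the two bytes of one 16-bit sample.
inline std::uint16_t swap16(std::uint16_t v)
{
    return static_cast<std::uint16_t>((v >> 8) | (v << 8));
}

// Swap every sample in a row in place, e.g. before handing a
// little-endian row to a writer that expects big-endian samples.
inline void swap_row16(std::uint16_t* row, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        row[i] = swap16(row[i]);
    }
}
```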
I also got my compressor working on float data. The way it handles floats is via lossless conversion of floats to ints in an E.M fixed point format, previously discussed here and here. This then lets you do normal integer math for the prediction filters, losslessly. As noted in those previous posts, normal floats have too much gap around zero, so in most cases you would be better off using what I call the "normal form", which treats everything below 1.0 as denorm (eg. no negative exponents are preserved), though obviously this is lossy.
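To give a feel for what a lossless float-to-int mapping looks like, here is a sketch of the common order-preserving bit trick for IEEE 32-bit floats. This is not necessarily the E.M format from the earlier posts; it is a stand-in that has the same key property (a bijection, so integer prediction on the result is lossless). The function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit IEEE floats");

// Map a float to a uint32 so that integer order matches float order.
// Negative floats have all bits flipped; non-negative floats get the top bit set.
std::uint32_t float_to_ordered(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return (bits & 0x80000000u) ? ~bits : (bits | 0x80000000u);
}

// Exact inverse of float_to_ordered.
float ordered_to_float(std::uint32_t u)
{
    const std::uint32_t bits = (u & 0x80000000u) ? (u & 0x7FFFFFFFu) : ~u;
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Once the pixels are in this form, the same templated ring predictors can run on them as 32-bit unsigned ints.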
Anyway, the compressor on floats seems to work fine but I don't have any real float/HDR image source data, and I don't know of any compressors to test against, so there you go.
ADDENDUM: I just found that OpenEXR has some sample images, so maybe I'll try those.
ADDENDUM 2 : holy crap OpenEXR is a bloated distribution. It's 22 MB just for the source code. It comes with their own big math and threading library. WTF WTF. If you're serious about trying to introduce a new interchange format, it should be STB style - one C header. There's no need for image formats to be so complex. PNG is over-complex and this is 100X worse. OpenEXR has various tile and multi-resolution streams possible, various compressors, the fucking kitchen sink and pot of soup, WTF.
3 comments:
I was supposed to add OpenEXR support to stb_image, but totally flaked :(
Won, I've thought about doing that a few times too, linking against OpenEXR is such a pain in the ass that I'd like to avoid doing that in NVTT. However, implementing all the OpenEXR features doesn't seem to be trivial either. Maybe having support for loading only the files that most applications generate by default is not too hard.
Something that I'd also like to do is to extend Thatcher's PSD reader to support 16 bit. It seems that should be fairly trivial...