I was remembering that modern LZs like LZMA (BitKnit, etc.) that (can) use pos&3 as a context for literals might like bitmaps in 32-bit XRGB rather than 24-bit RGB.
In XRGB, each color channel gets its own entropy coding statistics. Offset bottom bits also work when the offsets are whole-pixel steps (off&3 will be zero). In 24-bit RGB all of that structure is mod-3, which we don't model.
(in general LZMA-class compressors fall apart a bit if the structure is not the typical 4/8/pow2)
On the other hand, in compressors it's generally terrible to stick extra bytes in and give the compressor more work to do. In this case we're injecting a 0 as every 4th byte, and the compressor has to figure out that those are all redundant just to get back to its original size.
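To make the pos&3 point concrete, here is a minimal Python sketch. The helper names and the R,G,B,0 byte order are my assumptions for illustration, not anything from the actual test harness. It pads 24-bit pixels to 32 bits and then records which color channels land in each pos&3 literal context.

```python
def pad_rgb_to_xrgb(rgb: bytes) -> bytes:
    """Append a zero pad byte to every 3-byte pixel (byte order R,G,B,0 is assumed)."""
    if len(rgb) % 3 != 0:
        raise ValueError("expected a whole number of 3-byte pixels")
    out = bytearray()
    for i in range(0, len(rgb), 3):
        out += rgb[i:i + 3]
        out.append(0)
    return bytes(out)


def channels_seen_per_context(data: bytes, bytes_per_pixel: int) -> dict:
    """For each literal context (pos & 3), collect the channel indices that fall in it."""
    seen = {ctx: set() for ctx in range(4)}
    for pos in range(len(data)):
        seen[pos & 3].add(pos % bytes_per_pixel)
    return seen


rgb = bytes(range(24))        # 8 dummy pixels
xrgb = pad_rgb_to_xrgb(rgb)   # 4/3 as many bytes

rgb_contexts = channels_seen_per_context(rgb, 3)
xrgb_contexts = channels_seen_per_context(xrgb, 4)
print(rgb_contexts)    # every context mixes all three channels
print(xrgb_contexts)   # every context holds exactly one channel
```

In the 24-bit layout each pos&3 context sees R, G and B in rotation, so the per-context statistics are a blend. In the padded layout each context is a single channel (or the pad byte).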
Anyway, this is an old idea, but I don't think I ever actually tried it. So :
PDI_1200.bmp

LZNA :

24-bit RGB : LZNA : 2,760,054 -> 1,376,781
32-bit XRGB: LZNA : 3,676,818 -> 1,311,502

24-bit RGB with DPCM filter : LZNA : 2,760,054 -> 1,022,066
32-bit XRGB with DPCM filter : LZNA : 3,676,818 -> 1,015,379 (MML8 : 1,012,988)

webpll : 961,356
paq8o8 : 1,096,342

moses.bmp

24-bit RGB : LZNA : 6,580,854 -> 3,274,757
32-bit XRGB: LZNA : 8,769,618 -> 3,022,320

24-bit RGB with DPCM filter : LZNA : 6,580,854 -> 2,433,246
32-bit XRGB with DPCM filter : LZNA : 8,769,618 -> 2,372,921

webpll : 2,204,444
gralic111d : 1,822,108

other compressors :

32-bit XRGB with DPCM filter : LZA : 8,769,618 -> 2,365,661 (MML8 : 2,354,434)

24-bit RGB no filter : BitKnit : 6,580,854 -> 3,462,455
32-bit XRGB no filter : BitKnit : 8,769,618 -> 3,070,141
32-bit XRGB with DPCM filter : BitKnit : 8,769,618 -> 2,601,463

32-bit XRGB: LZNA : 8,769,618 -> 3,022,320
32-bit XRGB: LZA : 8,769,618 -> 3,009,417

24-bit RGB: LZMA : 6,580,854 -> 3,488,546 (LZMA lc=0,lp=2,pb=2)
32-bit XRGB: LZMA : 8,769,618 -> 3,141,455 (LZMA lc=0,lp=2,pb=2)

repro:

bmp copy moses.bmp moses.tga 32
V:\devel\projects\oodle\radbitmap\radbitmaptest
radbitmaptest64 rrz -z0 r:\moses.tga moses.tga.rrz -f8 -l1
Key observations :
1. On "moses" unfiltered : padding to XRGB does help a solid amount (3,274,757 to 3,022,320 for LZNA), despite the source being 4/3 bigger. I think that proves the concept. (BitKnit and LZMA show an even bigger difference.)
2. On filtered data, padding to XRGB still helps, but much (much) less. Presumably this is because post-filter data is just a bunch of low values, so the 24-bit RGB data is not so multiple-of-three structured (it's a lot of 0's, +1's, and -1's, less coherent, less difference between the color channels, etc.)
3. On un-filtered data, "sub" literals might be helping BitKnit (it beats LZMA on 32-bit unfiltered, and hangs with LZNA). On filtered data, the sub-literals don't help (might even hurt) and BK falls behind. We like the way sub literals sometimes act as an automatic structure stride and delta filter, but they can't compete with a real image-specific DPCM.
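A rough sketch of why sub literals can act like a stride and delta filter on unfiltered XRGB. This is a simplified model that I'm assuming for illustration: each literal is coded as the difference from the byte at a fixed offset back (standing in for the last match offset). It is not BitKnit's actual coder. When that offset is a whole pixel (4 bytes), the subtraction is channel-aligned and amounts to a left-neighbor delta. At a misaligned offset it mixes channels and the residuals are large.

```python
def sub_literals(data: bytes, offset: int) -> bytes:
    """Residual = (byte - byte `offset` back) mod 256; the first `offset` bytes pass through."""
    out = bytearray(data[:offset])
    for pos in range(offset, len(data)):
        out.append((data[pos] - data[pos - offset]) & 0xFF)
    return bytes(out)


def max_signed_magnitude(residuals: bytes) -> int:
    """Largest |residual| when each byte is read as a wrapped signed difference."""
    return max(min(r, 256 - r) for r in residuals)


# Synthetic smooth gradient in XRGB layout (R,G,B,0); the values are made up for the demo.
pixels = bytearray()
for i in range(16):
    pixels += bytes([10 + i, 100 + 2 * i, 200 - i, 0])
xrgb = bytes(pixels)

aligned = sub_literals(xrgb, 4)[4:]      # offset = one whole pixel
misaligned = sub_literals(xrgb, 3)[4:]   # offset straddles channels

print(max_signed_magnitude(aligned))     # small: same-channel deltas
print(max_signed_magnitude(misaligned))  # large: cross-channel differences
```

A real DPCM filter can use 2D neighbors and better predictors, which fits the observation that sub literals can't compete with it on images.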
Now, XRGB padding is an ugly way to do this. You'd much rather stick with 24-bit RGB and have an LZ that works inherently on 3-byte items.
The first step is :
LZ that works on "items" (eg. item = a pixel)
LZ matches (offsets and lens) are in whole items
  (the closer analogue to bottom-bits style would be to allow whole items and "remainders"; that's /item and %item, and let the entropy coder handle it if remainder==0 always; but it's probably best to just force remainders=0)
When you don't match (literal item), each byte in the item gets its own entropy stats
  (eg. color channels of pixels)

This may be useful on things other than just images.
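As a rough illustration of the "each byte in the item gets its own entropy stats" part, the sketch below compares the ideal order-0 cost of a literal stream with one pooled model against one model per byte position in the item. The function names and the synthetic pixel data are assumptions for the example. It uses static entropy, so it ignores adaptation cost and everything else a real coder does.

```python
import math
from collections import Counter


def order0_bits(symbols) -> float:
    """Ideal static order-0 coding cost of a symbol sequence, in bits."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c * math.log2(c / total) for c in counts.values())


def pooled_vs_per_channel_bits(literals: bytes, item_size: int):
    """Cost with one shared model vs. one model per byte position within the item."""
    pooled = order0_bits(literals)
    split = sum(order0_bits(literals[ch::item_size]) for ch in range(item_size))
    return pooled, split


# Synthetic 3-byte items: each channel takes two values in its own range.
literals = bytearray()
for i in range(64):
    literals += bytes([10 + (i & 1), 100 + ((i >> 1) & 1), 200 + ((i >> 2) & 1)])

pooled, split = pooled_vs_per_channel_bits(bytes(literals), item_size=3)
print(f"pooled: {pooled:.1f} bits, per-channel: {split:.1f} bits")
```

With static statistics the split cost can never exceed the pooled cost. How much of that gain survives adaptive modeling depends on how different the channels really are.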
The other step is something like :
Offset is an x,y delta instead of linear
  (this replaces offset bottom bits)
  could be generically useful in any kind of row/column structured data
Filtering for values with x-y neighbors
  (do you do the LZ on un-filtered data, and only filter the literals?)
  (or do you filter everything and do the LZ on filter residuals?)

And a lot of this is just webp-ll.
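One possible way to turn a linear offset (counted in whole items) into an x,y delta, sketched under my own assumptions: `width` is the row length in items, dy counts rows up, and dx counts columns to the left, wrapped so it stays centered around zero. This is not webp-ll's actual distance mapping, just the basic idea.

```python
def linear_to_xy(offset_items: int, width: int):
    """Split a backward offset (in items) into (dx, dy), with dx in [-width//2, width - width//2)."""
    half = width // 2
    dy, dx = divmod(offset_items + half, width)
    return dx - half, dy


def xy_to_linear(dx: int, dy: int, width: int) -> int:
    """Inverse mapping: rows up times row length, plus columns left."""
    return dy * width + dx


width = 1200  # hypothetical row length in pixels
print(linear_to_xy(1, width))          # left neighbor
print(linear_to_xy(width, width))      # pixel directly above
print(linear_to_xy(width - 1, width))  # above-right neighbor
print(linear_to_xy(width + 1, width))  # above-left neighbor
```

The common 2D neighbors all become small (dx, dy) pairs regardless of the image width, which is the property the offset bottom bits were approximating.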