
04-04-13 - Oodle Compression on BC1 and WAV

I put some stupid filters in, so here are some results for the record and my reference.

BC1 (DXT1/S3TC) DDS textures :

All compressors run in max-compress mode. Note that it's not entirely fair because Oodle has the BC1 swizzle and the others don't.

Some day I'd like to do a BC1-specific encoder. Various ideas and possibilities there. Also RD-DXTC.

I also did a WAV filter. This one is particularly ridiculous because nobody uses WAV, and if you want to compress audio you should use a domain-specific compressor, not just OodleLZ with a simple delta filter. I did it because I was annoyed that RAR beat me on WAVs (due to its having a multimedia filter), and RAR should never beat me.

WAV compression :

See also : same chart with 7z (not really fair cuz 7z doesn't have a WAV filter)

Happy to see that Oodle-filt handily beats RAR-filt. I'm using just a trivial linear gradient predictor :


out[i] = in[i] - 2*in[i-1] + in[i-2]

this could surely be better, but whatever, WAV filtering is not important.
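In code, the filter and its inverse look something like this (a sketch of my reconstruction, Python for clarity, not the actual Oodle code) :

```python
# Linear gradient predictor delta filter : predict each sample from the
# gradient of the previous two samples, store the residual.
# (Sketch; real code would work on 16-bit samples with wraparound.)

def lgp_filter(samples):
    out = []
    for i, s in enumerate(samples):
        p1 = samples[i-1] if i >= 1 else 0
        p2 = samples[i-2] if i >= 2 else 0
        pred = 2*p1 - p2           # linear gradient predictor
        out.append(s - pred)       # residual = in[i] - 2*in[i-1] + in[i-2]
    return out

def lgp_unfilter(residuals):
    out = []
    for i, r in enumerate(residuals):
        p1 = out[i-1] if i >= 1 else 0
        p2 = out[i-2] if i >= 2 else 0
        out.append(r + 2*p1 - p2)  # add the prediction back
    return out
```

Note that a perfectly linear ramp comes out as all-zero residuals, which is exactly what you want feeding an entropy coder.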

I also did a simple BMP delta filter and EXE (BCJ/relative-call) transform. I don't really want to get into the business of offering all kinds of special case filters the way some of the more insane modern archivers do (like undoing ZLIB compression so you can recompress it, or WRT), but anyhoo there's a few.


ADDED : I will say something perhaps useful about the WAV filter.

There's a bit of a funny issue because the WAV data is 16 bit (or 24 or 32), and the back-end entropy coder in a simple LZ is 8 bit.

If you just take a 16-bit delta and put it into bytes, then most of your values will be around zero, and you'll make a stream like :

[00 00] [00 01] [FF FF] [FF F8] [00 04] ...
The bad thing you should notice here is that the high bytes are switching between 00 and FF even though the values have quite a small range. (Note that the common thing of centering the values with +32768 doesn't change this at all).
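A tiny illustration of that stream (toy delta values chosen to match the example above) :

```python
# Small signed 16-bit deltas stored as raw two's-complement byte pairs :
# the high byte flips between 00 and FF even though the values are tiny.

deltas = [0, 1, -1, -8, 4]
stream = []
for d in deltas:
    u = d & 0xFFFF                  # two's-complement 16-bit
    stream += [u >> 8, u & 0xFF]    # big-endian byte pair

print([f"{b:02X}" for b in stream])
# high bytes alternate 00 / FF, which hurts the byte-wise entropy coder
```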

You can make this much better just by doing a bias of +128. That makes it so the most important range of values around zero (specifically [-128,127]) all have the same top byte.
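For example (continuing the toy deltas from above) :

```python
# The +128 bias trick : after biasing, every delta in [-128,127]
# shares the single top byte 0x00.

deltas = [0, 1, -1, -8, 4]
biased = [(d + 128) & 0xFFFF for d in deltas]
high_bytes = {b >> 8 for b in biased}
print(high_bytes)   # a single high byte for all the small deltas
```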

I think it might be even slightly better to do a "folded" signed->unsigned map, like

{ 0,-1,1,-2,2,-3,...,32767,-32768 }
The main difference is that values like -129 and +128 get the same high byte in this mapping, rather than two different high bytes as in the simple +128 bias scheme.
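The folded map is the usual "zigzag" style signed-to-unsigned mapping; a sketch :

```python
# Folded signed->unsigned map : 0->0, -1->1, 1->2, -2->3, 2->4, ...
# Values of similar magnitude get similar codes regardless of sign.

def fold(d):
    return 2*d if d >= 0 else -2*d - 1

def unfold(u):
    return u // 2 if u % 2 == 0 else -(u + 1) // 2

# -129 and +128 land in the same high byte under this map :
print(fold(128) >> 8, fold(-129) >> 8)   # both 1
```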

Of course you really want a separate 8-bit Huffman for alternating pairs of bytes. One way to get that is to use a few bottom bits of position as part of the literal context. Also, the high byte should really be used as context for the low byte. But both of those are beyond the capabilities of my simple LZ-huffs, so I just deinterleave the high and low bytes into two streams.
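The deinterleave itself is trivial; something like :

```python
# Split a stream of 16-bit words into separate high-byte and low-byte
# streams, so each stream gets its own entropy coder statistics.
# (Sketch of the idea, not the actual Oodle code.)

def deinterleave(words):
    hi = bytes(w >> 8 for w in words)
    lo = bytes(w & 0xFF for w in words)
    return hi, lo

def interleave(hi, lo):
    return [(h << 8) | l for h, l in zip(hi, lo)]
```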

11 comments:

won3d said...

You are beating RAR in one important aspect: not having an embedded exploitable bytecode engine:

http://blog.cmpxchg8b.com/2012/09/fun-with-constrained-programming.html

For deltas, one thing that Jeff said that makes lots of sense is to segregate the sign bits since they're all random, and the magnitudes end up having an exponential-ish distribution. Obviously you still have the top/bottom byte issue.

One random thing I'd been wondering about for delta-coded things was if you knew the bounds of the original signal, could you use the previous value as something that predicted the sign? If you know you're at the maximum (minimum) value there's no way to go but down (up). And the middle is all random anyway.

cbloom said...

"For deltas, one thing that Jeff said that makes lots of sense is to segregate the sign bits since they're all random"

The good work on this is still CALIC and JPEG-LS. They do some clever things with post-delta distributions to ensure that the sign bit actually *is* random. Normally it is not, but if you bias correctly you can make it random. The other clever bit is to flip your post-delta values so that you always put the longer tail on the positive side (or vice-versa, doesn't matter as long as it's consistent).

"If you know you're at the maximum (minimum) value there's no way to go but down (up)"

Yeah, for sure. On 16-bit WAV I'm sure this doesn't matter because reaching the edge is so rare, but on 8-bit BMP type data it can be measurable, because pixel values do hang out around 00 and FF a lot.

cbloom said...

"You are beating RAR in one important aspect: not having an embedded exploitable bytecode engine:"

Ah, craziness! Presumably this is why Google now makes it troublesome to send RARs in email. Grumble.

Brian said...

"Presumably this is why Google now makes it troublesome to send RARs in email. Grumble."

Hah, though do you really use RARs? Isn't it an archaic archiving format that "no one" should use? I of course mean in preference of 7z or zip with lzma/etc. Just learning more in depth about archive formats, so please excuse what might be ignorance.

cbloom said...

Sure I use RAR. It just works, and who cares about super compression ratios anymore? (*)

Disks are cheap and the internet is fast. It's not like the old days, when you didn't want to spend another $1000 to get another 1 MB of hard disk, or had to send data over a 2400 baud modem; back then you really cared about max compression.

(* = of course there are lots of good reasons to care about compression, but none of them apply to just home users. Content providers want to get their bandwidth down; distributors need to fit data on a DVD; mobile providers want downloads to be tiny)

Actually as a home user I rarely use anything but zip.

The only data on my home disk that's annoying me are all the super-huge RAW photos from my digital camera. I'd like to have a RAW-specific compressor, that might actually be worth using because it would put a serious dent in the size of my backup set. (and backups are still ridiculously slow)

nothings said...

I don't think there's any evidence the rar bytecode engine is exploitable. At least there isn't any at that link.

Aaron said...

Compressing RAWs, great idea! There's this: Rawzor, though I've never tried it.

Trixter said...

"out[i] = in[i] - 2*in[i-1] + in[i-2]"

I'm confused; on a few tests, this is performing oddly. Did you mean this instead?

out[i] = in[i] + 2*in[i-1] - in[i-2]

I ask because the simplest linear predictor I've been using was just 2*(current sample) - (previous sample) and it seems to work more predictably than what you posted.

I just want to make sure I'm not going crazy...

cbloom said...

I think it's right, but maybe we have a syntax difference.

The current sample is in[i]

A pure neighbor predictor is

pred = in[i-1]

and

out[i] = in[i] - pred

A linear gradient predictor is

pred = in[i-1] + (in[i-1] - in[i-2])

= 2*in[i-1] - in[i-2]

then

out[i] = in[i] - pred

= in[i] - (2*in[i-1] - in[i-2])

= in[i] - 2*in[i-1] + in[i-2]

Trixter said...

Thanks for the longhand; the predictor we use is identical given the differences in our syntax.

However, I still don't understand how the sign is changing from:

= in[i] - (2*in[i-1] - in[i-2])

...to:

= in[i] - 2*in[i-1] + in[i-2]

Again, I might be missing something, but it looks like you're writing A-(B-C) = A-B+C ...

cbloom said...

Indeed!
