01-14-10 - A small note on Trellis quantization

See reference .

I guess this is obvious, but you do get a pretty nice win from using the true floating point Dct results rather than the quantized Dct results when you do TQ.

I believe the standard practice (what I was doing before anyway) is to do your normal fast Dct + quantization, which takes your integer pixels and makes quantized integer post-Dct output. You then apply TQ on that quantized-and-transformed output, which means instead of sending the true output { 4,3,0,0 } you also consider {4,2,0,0 } and {4,0,0,0 } and so on, measure J(R,D) of each and pick the best.

Okay, but the distortion of changing "1" to 0 is not the same as the distortion of another 1 to 0 , if those were not really the same 1 before quantization.

For example, say you're quantizing with a quantizer = 1.0 for simplicity, and no deadzone, even bucket sizes, so you have quantization buckets :

[ -0.5, 0.5 ] -> 0
[ 0.5 , 1.5 ] -> 1

In that case, when TQ decides to take a quantized "1" and send instead a 0, if the true value was 0.51 , that's not so bad. If the true original value was 1.49 , that's a lot worse.

However, interestingly, if the true original value was 1.49 , then we could send it as a "2". If the value is near a quantization boundary, then the distortion doesn't care whether you kick the value up or down, but the rate might be significantly different, in which case you should make a choice based on J and get a win.

So my Dct now also does the true float -> float transform just for use in the distortion measurement for TQ. It's also useful in this application to make sure your Dct doesn't do any scaling, so that the transform is Unitary, that is, L2 norm is preserved. That way the distortion measure in post-Dct space is the same as the distortion measure in pre-Dct space which means you can use the same lambda for lagrange J decisions.

No comments:

old rants