# cbloom rants

## 11/18/2008

### 11-18-08 - DXTC

I've been doing a little work on DXT1 compression. The main reason I'm looking at it is because I want to do something *better* than DXT1, and before I do that, I want to make sure I'm doing the DXTC right.

For now let's gather prior art :

The only real decision the coder gets to make is what the two 565 endpoints are. The other details you just have to get right - reconstruct the palette interpolants right, and assign the 16 indices to the closest palette entries right. For the moment I don't care about speed, I just want to make sure the indices are actually the best choices.

All the papers by Ignacio and J.M.P. van Waveren are modern classics. One thing of note : the "simplified" index finder that JMP has in his original paper is wrong. On page 12 he says "Evaluating the above expression reveals that the sub expression ( !b3 & b4 ) can be omitted because it does not significantly contribute to the final result" - apparently that is not true, because when I use the "rewritten" index finder it produces bad indexes. His original version of the index finder bit twiddle is fine.

One thing I've found with the index finding is that the integer and nonlinear effects are pretty strong and any ideas you have about planar geometry fail if you assume that your 4 points are colinear (they are not actually). For example, an obvious idea is to do a dot product test to pick between the 4 points. First dot product with the middle separating plane, then the two planes between the next pairs of points. This doesn't work. Part of the problem is because of the bad rounding in the generation of the 1/3 point. The JMP method is actually comparing vs the true distances, so that part is good.

In more detail, an obvious idea is to do something like this :

int dot = (color - c0) * (c1 - c0);

int bmid = (dot > mid01);
int b02 = (dot > mid02);
int b31 = (dot > mid31);

// using the DXT1 palette order label 0,2,3,1

int threebit = (bmid<<2) + (b02<<1) + b31;

int index = remap[threebit];

// remap is an 8-entry table that remaps from 3 bits to 2 bits to give an index

That "works" just as much as the "ryg" code works or the "Extreme DXT" method of finding indexes work - which is to say that it doesn't work.

FastDXT appears to be an implementation of the id "Real Time DXT" paper.

Ericsson Texture Compression (ETC2) is similar to DXTC but different; this is a pretty good paper, there are some interesting ideas here like the T and H blocks. It gets slightly better quality in some cases. It's obvious that you can beat DXTC by having a base color and a log-scaled delta color, rather than two end points. The two 565 end points is wasteful; you could for example do a 777 base color and a 444 log scaled delta.

Tight Frame Normal Map Compression is similar to DXT5 (really to 3Dc) with some small improvement on normal maps.

Extreme DXT Compression by Peter Uliciansky is indeed super fast. However, it has some errors. The division method that he uses to assign the indeces is not correct. You can improve it by offsetting the end points correctly to put the division quantization boundaries in the right place, but even then it's still not exactly correct unless the diagonal of the color bbox is along {1,1,1}. The error from this is pretty small and he's going for max speed, so it's semi reasonable what he does. Note that using the full range of the bbox for the division but insetting it to make the dxt1 colors (as he does) improves the indeces slightly, but it's not actually the correct range scale.

In more detail :

To find indices for the DXTC colors c0 and c1

You should use

base = c0 + (c0 - c1)/6

range = (c1 - c0) * 4/3

index = (color - base) * 4 / range;

or

index = (color - base) * 3 / (c1 - c0);

not :

index = (color - c0) * 4 / (c1 - c0);

But even if you do that it's still wrong because the planes through color space do not actually separate the colors from their closest palette entry.

MSDN compressed format documentation now has the right rounding for the 1/3 points, but the old DirectX8 docs have the wrong rounding. The old docs said

color_2 = (2*color_0 + 1*color_1 + 1)/3
color_3 = (1*color_0 + 2*color_1 + 1)/3

Which is actually better, but is not what the standard or the hardware does.

Simon Brown's Squish is very good. His Cluster Fit idea is elegant and clever and produces quite good quality.

There's a thread on MollyRocket about DXTC with some convenient code from "ryg". It's nice simple code, but it has some errors. The way he finds the indices is the dot product method which is wrong. Also the way he computes the 1/3 and 2/3 colors for the palette using LerpRGB is wrong, it rounds differently that the hardware really does.

One thing that some simple code gets wrong is the way to quantize from 24 bit to 565. You can't just take the top bits, because the 565 values are not reconstructed to the middle of their bins. You need to correct for the offsetting. You can either use the true range, which is something like -4 to 259, or you can skew the bits in the quantizer. The "ryg" code has a perfect quantizer that's very neat in the "As16Bit" function. An example is that the value 250 should map to 30 in 5 bits, not 31.

BTW the YCoCg-DXT5 method is pretty good on current hardware, but I'm not really looking at it. It's obviously inherently wasteful, it's just a way of making use of the current hardware, but it's not really what you'd want to be doing. It's also very large, at around 2.6 bits per byte (8 bits per pixel).

castano said...

The index evaluation method used by Peter Uliciansky is also used in our latest GPU implementation:

http://developer.nvidia.com/object/real-time-normal-map-dxt-compression.html

it was first proposed by JP in the Real-Time DXT Compression paper:

http://cache-www.intel.com/cd/00/00/32/43/324337_324337.pdf

and it was first implemented by the guys from Allegorithmic. I did not realize the bug in his implementation, I think ours is correct, right?

cbloom said...

Yeah, this :

const int GREEN_RANGE = 3;

float bias = maxGreen + (maxGreen - minGreen) / (2.0 * GREEN_RANGE);
float scale = 1.0f / (maxGreen - minGreen);

// Compute indices
uint indices = 0;
for (int i = 0; i < 16; i++)
{
uint index = saturate((bias - block[i].y) * scale) * GREEN_RANGE;
indices |= index << (i * 2);
}

Appears to be correct. You've got the bias and the *3 instead of *4.

cbloom said...

Though even getting the scale & bias right it's still not going to always pick the actual closest palette entry in the DXT1 case because of the integer truncation issues. The error will be small, however, so not really relevant for you guys and the real-time case.