Comments on cbloom rants: 04-05-12 - DXT is not enough - Part 2

Anonymous (2012-04-25):

The main thrust of the linked pdf seems to be the idea of encoding one endpoint explicitly and one as a delta from it, allowing more precision when the endpoints are near each other.

That is one of the features of the BC6H texture format introduced in DX 11, so the paper seems to be of little interest now.
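A minimal sketch of that endpoint-plus-delta idea in C; the field widths here are made up for illustration and are not BC6H's actual bit layout:

    // Sketch of endpoint-plus-delta encoding. Bit widths are
    // illustrative, not BC6H's real layout: one endpoint is stored
    // explicitly (10 bits used), the other as a 5-bit signed delta,
    // so nearby endpoints keep fine precision.

    #include <stdint.h>

    typedef struct {
        uint16_t e0;     // explicit endpoint, 10 bits used
        int8_t   delta;  // signed 5-bit delta, range [-16, 15]
    } DeltaEndpoints;

    // Encode: representable only when e1 is within delta range of e0;
    // a real encoder would fall back to an explicit mode otherwise.
    static int encode_endpoints(uint16_t e0, uint16_t e1, DeltaEndpoints *out)
    {
        int d = (int)e1 - (int)e0;
        if (d < -16 || d > 15) return 0;  // delta out of range
        out->e0 = e0;
        out->delta = (int8_t)d;
        return 1;
    }

    // Decode: the second endpoint is reconstructed relative to the first.
    static uint16_t decode_e1(DeltaEndpoints enc)
    {
        return (uint16_t)(enc.e0 + enc.delta);
    }

The payoff is that when the two endpoints land close together, as in smooth gradients, the delta's few bits buy fine steps instead of being spent spanning the whole range.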
brunogm (2012-04-20):

Can you comment on this texture compressor, which reuses some DXT hardware in a clever way?

http://pholia.tdi.informatik.uni-frankfurt.de/~philipp/publications/ftcpaper-4.p.pdf

cbloom (2012-04-10):

Yeah, I should clarify a few points:

1. Obviously the most interesting possibility for a texel shader is some combination of decoding, compositing, baking lighting, procedural texture generation, etc. Just decompressing is not that compelling.

2. The point of this thought exercise is not necessarily to say "we should have texel shaders", but to see why DXTC is in this preferred position at the moment, and what exactly we would have to do to remove it.

So, @ryg:

My imagination was that you would batch up the work somehow. When vertex/pixel shaders try to fetch a texel and it's not there, they get their stack pushed while waiting on that result; once you accumulate a bunch of texel requests, you run a batch of them.

It's not that different from the normal vertex shading results cache. In the end it's just dynamic programming: storing computed results and running a shader to fill the slots that are needed. Granted, the cache scheme needs to be more complex than the current vertex cache.
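A toy sketch of that batching scheme in C. Every name here (TexelKey, run_texel_shader, and so on) is hypothetical, and real hardware would suspend warps rather than call library functions, but it shows the memoization structure:

    // Toy sketch of texel-request batching: on a cache miss the
    // requester is suspended, the request is queued, and a decode
    // "texel shader" runs over a whole batch of misses at once.

    #include <stdint.h>

    #define BATCH_SIZE 64

    typedef struct { uint32_t texture_id, block_x, block_y; } TexelKey;

    static TexelKey pending[BATCH_SIZE];
    static int      pending_count = 0;

    // Provided elsewhere (hypothetical): the cache of decoded blocks
    // and the batch decode kernel (the "texel shader").
    extern int  cache_lookup(TexelKey key, void *out_texels);
    extern void run_texel_shader(const TexelKey *keys, int n);
    extern void suspend_requester_until_ready(TexelKey key);

    void fetch_texel_block(TexelKey key, void *out_texels)
    {
        if (cache_lookup(key, out_texels))
            return;                        // hit: block already decoded

        pending[pending_count++] = key;    // miss: queue the request
        if (pending_count == BATCH_SIZE) { // enough work to amortize a launch
            run_texel_shader(pending, pending_count);  // fills the cache
            pending_count = 0;
        }
        // Requester sleeps here; on resume it retries the lookup and hits.
        suspend_requester_until_ready(key);
    }

The batch size is the obvious tension: too small and each miss eats the full shader-launch latency, too large and requesters stall waiting for the queue to fill.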
ryg (2012-04-09):

Even just allowing it adds significant complications to the hardware: cache design is quite timing-sensitive with regard to sizing things correctly, and having completely unpredictable latency in the middle makes that hard. Also, if you dispatch to variable-latency shader code, you won't get around having multiple outstanding misses at the same time with hit-under-miss processing; it gets pretty gnarly fast.

It's especially awkward because GPU shader cores really have horrible latencies for everything, and really need work in large batches to compensate for it. E.g. on Fermi, a single ALU op has about 20 cycles of latency. Decoding one texture L1/L2 cache line's worth of data is a shitty granularity to be spinning up a shader core for, and when you do, it's going to take several hundred or even thousands of cycles to complete even a very simple "texture fetch shader" - as if you had missed L2 and done the fetch from memory. It just doesn't fit well in the current architectures.

Fundamentally, caches need a very different design when you have to assume that a double-digit percentage of your cache lines are allocated-but-outstanding for a sustained period.

alex peterson (2012-04-09):

I like the idea of exposing that part of the pipeline. Give developers a way to get in there and let them be responsible if they slow their application down for the sake of some custom decompression, etc. Hopefully we'll see this soon!

Tom Forsyth (2012-04-05):

The idea of a texel shader has been kicked around ever since pixel shaders were added, but mainly for clever compositing shaders (multiple layers of detail maps, using heightfield normal maps and turning them into vector ones on cache miss, etc.).

The problem with the DXT path is that it still needs to be really fast, and that usually means custom HW with lots of bit-twiddling and tiny palettes. It's hard to make those operations programmable without hurting perf, or without just adding a bunch of area that nothing else on the chip uses. If you can figure out a decode that can be done with small changes to the standard shader pipeline, then there's an interesting path forward.
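For concreteness, here is roughly the fixed-function work Tom is describing: a DXT1 (BC1) block decode per the published format, with the punch-through-alpha mode (c0 <= c1) omitted to keep the sketch short:

    // Roughly what fixed-function DXT1 (BC1) decode does for one 4x4
    // block: unpack two 5:6:5 endpoints, build a 4-entry palette,
    // then select per texel with 2-bit indices. The c0 <= c1
    // punch-through-alpha mode is omitted for brevity.

    #include <stdint.h>

    static void expand565(uint16_t c, uint8_t rgb[3])
    {
        rgb[0] = (uint8_t)(((c >> 11) & 31) * 255 / 31);  // 5-bit red
        rgb[1] = (uint8_t)(((c >>  5) & 63) * 255 / 63);  // 6-bit green
        rgb[2] = (uint8_t)(( c        & 31) * 255 / 31);  // 5-bit blue
    }

    void decode_dxt1_block(const uint8_t block[8], uint8_t out[16][3])
    {
        uint16_t c0 = (uint16_t)(block[0] | (block[1] << 8));
        uint16_t c1 = (uint16_t)(block[2] | (block[3] << 8));
        uint32_t indices = (uint32_t)block[4]         |
                           ((uint32_t)block[5] << 8)  |
                           ((uint32_t)block[6] << 16) |
                           ((uint32_t)block[7] << 24);

        uint8_t palette[4][3];            // the "tiny palette"
        expand565(c0, palette[0]);
        expand565(c1, palette[1]);
        for (int ch = 0; ch < 3; ch++) {  // two interpolated entries at 1/3, 2/3
            palette[2][ch] = (uint8_t)((2 * palette[0][ch] + palette[1][ch]) / 3);
            palette[3][ch] = (uint8_t)((palette[0][ch] + 2 * palette[1][ch]) / 3);
        }

        for (int i = 0; i < 16; i++) {    // 2 bits of index per texel
            uint32_t sel = (indices >> (2 * i)) & 3;
            out[i][0] = palette[sel][0];
            out[i][1] = palette[sel][1];
            out[i][2] = palette[sel][2];
        }
    }

Everything in it is shifts, masks, and a four-entry palette: cheap as dedicated gates, but an awkward grain of work to spin up a general shader core for, which is ryg's point above.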