Comments on cbloom rants: 11-21-08 - More Texture Compression Nonsense

cbloom, 2008-11-24 22:28 -08:00:

"Hmm... in my experience PCIe bandwidth is generally not the bottleneck these days."

Yeah, I was talking about the speed of actually drawing from the texture. Last time I looked, it was a lot faster to read filtered texels from DXTC because of the better cache usage. Is that not true any more? Can I use uncompressed textures at full speed?

"You are much more likely to be limited by the bandwidth of the permanent storage device."

Obviously HD is slow, but my argument is that for *paging* the seek time matters more than throughput. That's not true for *streaming* (e.g. Bink3d) or for the initial level load, of course.

"Besides that, for some developers it's not just a matter of bandwidth, but of cutting costs by reducing the number of DVDs required to distribute the game."

Yeah, absolutely. Though that might only be an issue for one developer ;)

castano, 2008-11-24 22:09 -08:00:

Hmm... in my experience PCIe bandwidth is generally not the bottleneck these days. You are much more likely to be limited by the bandwidth of the permanent storage device.

Besides that, for some developers it's not just a matter of bandwidth, but of cutting costs by reducing the number of DVDs required to distribute the game.

Assen, 2008-11-23 07:18 -08:00:

Our texture management experience has shown that if you have enough memory, textures go into the Windows disk cache anyway, and the reads are not a problem; however, the uploads to the surfaces the GPU can render from are very slow, and remain a significant problem (with managed pool textures being a bit better than default pool). So for our particular case, the highly compressed 1-bpp version as a replacement for the disk cache will not improve the user experience until the number of textures "on standby" exceeds the available physical memory.

cbloom, 2008-11-21 18:16 -08:00:

Hey, this imaginary compression format game is fun ;)

You could have just one base color and then two deltas to define edges, so that you have a parallelogram in color space, and then send 2 indexes as U & V in that polygon. Sort of like the "tight frame" thing.

Or send 4 colors to define two edges, and then send a "t" index to interpolate along each edge and an "s" index to interpolate between the two edges. This lets you do curved paths in color space, like the way you make a curve by putting strings on the coordinate axes in elementary school.

Even if your end points are just 565, you should be able to hit colors very exactly because you have so many options for how to place your end points.

Sean Barrett, 2008-11-21 18:05 -08:00:

Poking around online searching for public information on BC7, I see that the basic theory is that they use 2 or 3 lines instead of 1, thus allowing you to better handle regions with complex transitions.

If you get 8 bpp instead of 4 bpp, then one thing you can do is just store two DXT1 blocks, so you can encode two lines and two indices along those lines. You don't need to store two indices, just a choice of which line, so that gives you back 1 bpp, so you end up with 16 more bits per block to spend on upgrading the quality of the endpoints, or to allow twice as many points along each line.

I'd guess allowing twice as many points is probably better, but maybe more expensive in hardware (and makes it harder to do optimal encoding), so you'd be better off improving the endpoint precision.

After poking around with the numbers, I guess I'd do something like store one end point as 777, the other as a signed 666 relative to it, and you have one bit left over that you use to indicate that you want to right-shift both of them by 1, to give better precision in dark areas. Or maybe 676 and 2 shift bits for one end, and 565 and 3 shift bits for the other end (so you can better express small steps even in non-dark areas). Either one adds up to 40 bits for a pair of endpoints, so 80 bits for two pairs, plus 3 bits per pixel to choose a line and a point along it, giving 128 bits.

Actually, to go back to the original, you could just do two independent lines, pick an independent point on each of them, and average or sum those. That gives you more degrees of freedom per pixel; if you do something like sum them, it lets you better handle data where one channel is totally special (e.g. like YCoCg, or if somebody's encoded something independent in one channel), etc. Probably super-painful to try to optimally generate.
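
For anyone who wants to check the bit budget in the comment above, here is a minimal Python sketch of the arithmetic. It assumes a 4x4 block at 8 bpp (128 bits) and two lines per block; the names `LAYOUT_A`, `LAYOUT_B`, and `block_bits` are made up for illustration and do not come from any real format or library.

```python
# Sketch of the bit-budget arithmetic only; this is not a real block format.
PIXELS_PER_BLOCK = 16                  # assumed 4x4 block
BLOCK_BITS = 8 * PIXELS_PER_BLOCK      # 8 bpp -> 128 bits per block

# Option A: one endpoint as 777, the other as a signed 666 delta, plus 1 shift bit.
LAYOUT_A = {"endpoint_777": 7 + 7 + 7, "delta_666": 6 + 6 + 6, "shift_flag": 1}

# Option B: 676 with 2 shift bits for one end, 565 with 3 shift bits for the other.
LAYOUT_B = {"endpoint_676": 6 + 7 + 6, "shift_a": 2,
            "endpoint_565": 5 + 6 + 5, "shift_b": 3}


def block_bits(pair_layout, lines=2, index_bits_per_pixel=3):
    """Total bits for `lines` endpoint pairs plus the per-pixel selectors.

    The 3 selector bits per pixel are 1 bit to choose the line and
    2 bits to choose a point along it.
    """
    endpoint_bits = lines * sum(pair_layout.values())
    selector_bits = index_bits_per_pixel * PIXELS_PER_BLOCK
    return endpoint_bits + selector_bits


# Two plain DXT1 blocks spend 2 bits per pixel on indices, twice over.
two_dxt1_index_bits = 2 * 2 * PIXELS_PER_BLOCK
# Line choice + one index needs only 3 bits per pixel.
shared_index_bits = 3 * PIXELS_PER_BLOCK
bits_saved = two_dxt1_index_bits - shared_index_bits

print(sum(LAYOUT_A.values()), sum(LAYOUT_B.values()))   # bits per endpoint pair
print(block_bits(LAYOUT_A), block_bits(LAYOUT_B), BLOCK_BITS)
print(bits_saved)                                        # bits freed per block
```

Both layouts come to 40 bits per endpoint pair and fill the 128-bit block exactly, and the saving from sharing one index is 16 bits per block, which is where the extra endpoint precision comes from.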