1. Simple simple simple. The decoder should be implementable in ~5000 lines as a single file stb.h style header. Keep it simple!
2. It should be losslessly transcodable from JPEG, a la packJPG/Lepton. That is, JPEG1 should be contained as a subset (this just means having an 8x8 DCT mode and a quantization matrix). You could have other block modes in JPEG2 that simply aren't used when you transcode JPEG. You replace the entropy-coded back-end with the JPEG2 one and should get about 20% file size reduction.
IMO this is crucial for rolling out a new format; nobody should ever be transcoding existing JPEGs and thereby introducing new error.
3. Reasonably fast to decode. Slower than JPEG1 by maybe 2X is okay, but not by 10X. eg. JPEG-ANS is okay, JPEG-Ari is probably not okay. Also think about parallelism and GPU decoding for huge images (100 MP). Keeping decoding local is important (eg. each 32x32 block or so should be independently decodable).
4. Decent quality encoding without crazy optimizing encoders. The straightforward encode without big R-D optimizing searches should still beat JPEG.
5. Support for per-block Q , so that sophisticated encoders can do bit rate allocation.
6. Support alpha, HDR. Make a clean definition of color space and gamma. But *don't* go crazy with supporting ICC profiles and lots of bit depths and so on. This needs to be the smallest possible set of features. You don't want to get into the all-too-common situation where the format is too complex and nobody actually supports it right in practice, so you end up with a "spec standard" and a "de-facto standard", and real decoders don't parse lots of the optional modes correctly.
7. Support larger blocks & non-square blocks; certainly 16x16 , maybe 32x32 ? Things like 16x8 , etc. This is important for increasingly large images.
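The locality requirement in point 3 can be sketched roughly as follows. This is only an illustration under assumptions that are mine, not part of any spec: the tile index layout and the function names are hypothetical, and `zlib` stands in for the real entropy coder. The point is just that each 32x32 tile is coded on its own and located through an index, so tiles can be decoded in any order or in parallel.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

TILE = 32  # tile edge in pixels, per the "32x32 block or so" suggestion

def encode_tiles(pixels, width, height):
    """Compress every tile separately; return (index, payload).

    pixels is a bytes object holding one 8-bit channel in row-major order.
    Each index entry is (tx, ty, w, h, byte_offset, byte_size).
    """
    index, chunks, pos = [], [], 0
    for ty in range(0, height, TILE):
        for tx in range(0, width, TILE):
            w, h = min(TILE, width - tx), min(TILE, height - ty)
            raw = bytes(pixels[(ty + y) * width + tx + x]
                        for y in range(h) for x in range(w))
            blob = zlib.compress(raw)  # stand-in for the real entropy coder
            index.append((tx, ty, w, h, pos, len(blob)))
            chunks.append(blob)
            pos += len(blob)
    return index, b"".join(chunks)

def decode_tile(entry, payload):
    """Decode one tile using only its own bytes; no other tile is touched."""
    tx, ty, w, h, pos, size = entry
    return tx, ty, w, h, zlib.decompress(payload[pos:pos + size])

def decode_all(index, payload, width, height):
    """Decode all tiles through a thread pool and reassemble the image."""
    out = bytearray(width * height)
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda e: decode_tile(e, payload), index)
        for tx, ty, w, h, raw in results:
            for y in range(h):
                row = (ty + y) * width + tx
                out[row:row + w] = raw[y * w:(y + 1) * w]
    return bytes(out)
```

Because `decode_tile` needs nothing but its index entry, a viewer could decode only the tiles that are on screen, and a GPU decoder could hand one tile to each work group.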
Most of all keep it simple, keep it close to JPEG, because JPEG actually works and basically everything else in lossy image compression doesn't.
Anything that's not just DCT + quantize + entropy is IMO a big mistake, very suspicious and likely to be vaporware in the sense that you can make it look good on paper but it won't work well in reality.
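For concreteness, here is a minimal sketch of the "DCT + quantize" part for one 8x8 block, with a per-block scale on top of the quantization matrix as in point 5. The function names and the `block_q` parameter are my own assumptions for illustration, not anything from a spec, and the code is written for clarity rather than speed.

```python
import math

N = 8  # block size; 8x8 is the mode JPEG1 uses

def _scale(k):
    # Normalization that makes the DCT-II orthonormal.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct_1d(v):
    """Orthonormal DCT-II of a length-N sequence."""
    return [_scale(k) * sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                            for n in range(N))
            for k in range(N)]

def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    return [sum(_scale(k) * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N))
            for n in range(N)]

def transform_2d(block, f):
    """Apply the 1-D transform f to every row, then to every column."""
    rows = [f(r) for r in block]
    cols = [f([rows[y][x] for y in range(N)]) for x in range(N)]
    return [[cols[x][y] for x in range(N)] for y in range(N)]

def quantize(coeffs, qmatrix, block_q):
    """Divide by the step size and round to the nearest integer level.

    block_q is a hypothetical per-block multiplier on the shared matrix.
    """
    return [[round(coeffs[y][x] / (qmatrix[y][x] * block_q)) for x in range(N)]
            for y in range(N)]

def dequantize(levels, qmatrix, block_q):
    """Map integer levels back to coefficient values."""
    return [[levels[y][x] * qmatrix[y][x] * block_q for x in range(N)]
            for y in range(N)]
```

An encoder would call `quantize(transform_2d(block, dct_1d), qmatrix, block_q)` and hand the integer levels to the entropy coder; the decoder runs `dequantize` followed by `transform_2d(..., idct_1d)`. Raising `block_q` on a block spends fewer bits there, which is all the bit rate allocation hook needs.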
ADD :
I have in the past posted many times about how plain old baseline JPEG + decent back-end entropy (eg. packJPG/Lepton) is surprisingly competitive with almost every modern image codec.
That's actually quite surprising.
The issue is that baseline JPEG is doing *zero* R-D optimization. Even if you use something like mozjpeg, which does a bit of R-D optimization, it's optimizing for the *wrong* rate model (it assumes baseline JPEG coding, not the packJPG back-end I then actually use).
It's well known that doing R-D optimization correctly (with the right rate model) provides absolutely enormous wins in lossy compression, so the fact that baseline JPEG + packJPG without any R-D at all can perform so well is really an indictment of everything it beats. This tells us there is a lot of room for easy improvement.
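To make the rate model point concrete, here is a toy sketch of per-coefficient R-D optimization: choose the quantized level that minimizes J = D + lambda * R. Both rate models below are made up for illustration (they are not the real Huffman or packJPG costs), and the function names are hypothetical. What matters is that the best level changes when the rate model changes.

```python
import math

def rd_choose_level(coeff, step, lam, rate_bits):
    """Pick the level minimizing J = D + lam * R for one coefficient.

    Candidates are the nearest level, one step toward zero, and zero.
    rate_bits(level) is whatever rate model the encoder believes in.
    """
    nearest = round(coeff / step)
    sign = 1 if nearest > 0 else -1 if nearest < 0 else 0
    candidates = {nearest, nearest - sign, 0}

    def cost(level):
        distortion = (coeff - level * step) ** 2
        return distortion + lam * rate_bits(level)

    # Break ties toward the smaller magnitude.
    return min(candidates, key=lambda lv: (cost(lv), abs(lv)))

def rate_model_a(level):
    """Toy model: cost jumps with the bit length of the magnitude."""
    return 1 if level == 0 else 2 + 2 * abs(level).bit_length()

def rate_model_b(level):
    """Toy model: cost grows smoothly with the magnitude."""
    return 0.25 if level == 0 else 1.5 + math.log2(1 + abs(level))
```

With `coeff=13`, `step=8`, `lam=10`, model A picks level 1 (J = 25 + 40 = 65 beats 9 + 60 = 69), while model B keeps level 2 (J of about 39.9 beats 50). An encoder tuned against model A but coded with model B throws away quality for bits it never actually saves.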
Absolutely that's the way to go.
I'd add: support progressive rendering or hierarchical encoding, so that the full image doesn't have to be decoded to display just a thumbnail.
What do you think about JPEG XL? It has all the requirements...except the complexity.
In terms of quality, JPEG XL is the first standard that actually seems to provide a significant quality improvement over JPEG.
(eg. webp, JPEG-XR, JPEG-2000, all failed; H265 I frames are great but not usable due to patent/license issues).
OTOH there are lots of problems with JPEG XL, the first being severe over-complexity. It seems like they took Pik and lots of other work and crammed it all together and made a real mess.
There's still a huge empty space for a simple and definite real step up from JPEG.
You say that H265 frames are great, so what is the difference to AV1 or AVIF?
AVIF and HEIC are almost the same thing, except the codec: AV1 vs H.265. They have very similar quality/filesize ratio, but AV1 has a much better patent licensing story.
AVIF has shipped in Chrome (and is a work in progress in Firefox). Real-world deployment is a big advantage here. JPEG XL is much faster to encode and looks promising.
I'll see if I can do a test of AVIF...
Anybody know what encoder & settings are best? I see things like "avifenc" on the web but there's absolutely zero guidance on recommended settings.
If you post a blog showing comparisons between encoders, you need to post what encoders you use and what settings so that others can repro exactly. If you don't provide repro instructions it's not science.