Sticking with normal JPEG code streams, but trying to make the encoder and decoder as good as possible, you should do something like :
Encoder :
- RDO-based structure; e.g. the encoder is given lambda and finds the optimal R/D point. Unfortunately this has to be iterative because of the Huffman codes: decisions in one pass affect the Huffman codes for the next pass.
- A good perceptual metric to target. Maybe SSIM or x264's funny SATD activity thing, or something else.
- Trellis quantization; the JPEG-huff code block structure lends itself to trellis state optimization pretty directly.
- Better chroma subsampling (aware of the up-filter the decoder will use).
- Quant matrix optimization for the perceptual metric.
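
To make the lambda idea concrete, here is a minimal sketch of an RDO decision for a single quantized coefficient: pick the level that minimizes J = D + lambda * R. The function names and the flat `huff_len` rate model are assumptions for illustration only; a real encoder would look up the current Huffman code length for each (run, size) symbol, account for the effect of zeros on run lengths, and iterate as the codes change.

```python
def magnitude_bits(level):
    """JPEG-style size category: extra bits needed to code a nonzero level."""
    return abs(level).bit_length()


def approx_rate(level, huff_len=4.0):
    """Rough bit cost of one coefficient.

    Assumption: every nonzero symbol costs a flat `huff_len` bits of Huffman
    code plus its magnitude bits, and zeros are free. This ignores run lengths.
    """
    if level == 0:
        return 0.0
    return huff_len + magnitude_bits(level)


def rdo_quantize(coef, q, lam):
    """Choose the quantized level minimizing J = D + lam * R.

    Candidates are the nearest level, the next level toward zero, and zero.
    """
    nearest = int(round(coef / q))
    if nearest == 0:
        toward_zero = 0
    else:
        toward_zero = nearest - (1 if nearest > 0 else -1)
    candidates = {nearest, toward_zero, 0}

    def cost(level):
        distortion = (coef - level * q) ** 2  # squared error after dequantization
        return distortion + lam * approx_rate(level)

    # Break exact cost ties in favor of the smaller magnitude.
    return min(candidates, key=lambda level: (cost(level), abs(level)))
```

With `lam = 0` this reduces to plain nearest-level rounding; as lambda grows, small coefficients get pushed to zero because their bits no longer pay for the distortion they remove. Squared error stands in here for whatever perceptual metric is actually being targeted.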
Decoder :
- Deblocking filter, or maybe the "Unblock" histogram non-filter approach, or some combination.
- Luma-aided chroma upsample
- Expectation-in-bucket instead of center-of-bucket dequantization.
- Noise reinjection, perhaps predicting where some of the zeros in the DCT should in fact be small non-zeros.
- Shape-aware deringing ; similar to camera denoisers, there's a lot of work on this in the literature.
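
As a rough illustration of the expectation-in-bucket item, the sketch below compares the usual bucket-center reconstruction with the conditional mean of the coefficient inside its quantization bucket. It assumes AC coefficients follow a zero-mean Laplacian with scale `b` and that the encoder used round-to-nearest quantization; both are modeling assumptions, and `b` would have to be estimated somehow (for example per frequency, from the decoded levels). The function names are made up for this example.

```python
import math


def dequant_center(level, q):
    """Standard reconstruction: the center of the quantization bucket."""
    return float(level * q)


def dequant_expected(level, q, b):
    """Reconstruct at E[x | x in bucket] under an assumed Laplacian(scale=b).

    For level k != 0 the bucket is [(|k| - 0.5) * q, (|k| + 0.5) * q].
    Within it the density decays as exp(-t / b), so the conditional mean sits
    at offset  b - q / (exp(q / b) - 1)  from the low edge, below the center.
    """
    if level == 0:
        return 0.0  # the zero bucket is symmetric, so its mean is zero
    low_edge = (abs(level) - 0.5) * q
    ratio = q / b
    if ratio > 700.0:
        offset = b  # exp would overflow; the correction term is negligible here
    else:
        offset = b - q / math.expm1(ratio)
    return math.copysign(low_edge + offset, level)
```

For example, with `q = 16` and `b = 8`, level 1 reconstructs to about 13.5 rather than 16. When `b` is very large (a nearly flat distribution) the result approaches the bucket center, as expected.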
1 comment:
Encoder-wise, there are some (fairly old) papers that at least implement basic RDO and iterative optimization, e.g. this: http://www.ece.umassd.edu/FACULTY/acosta/ICASSP/Icassp_1995/pdf/ic952331.pdf (There's also http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4379276 which is considerably more recent but behind a paywall; the abstract sounds suspiciously like the authors did exactly the same thing as the 1995 paper without being aware of it).
JPEG is different from most audio/video formats in that the vast majority of JPEG-supporting apps use the same encoder/decoder (the IJG JPEG lib). Getting a half-decent implementation of one feature into the IJG lib is orders of magnitude more valuable than an excellent implementation of the same feature in a new library (or standalone encoder/decoder with source).