04-09-12 - Old Image Comparison Post Gathering

Perceptual Metrics, imdiff, and such. Don't think I ever did an "index post" so here it is :

01-18-11 - Hadamard
01-17-11 - ImDiff Release
01-12-11 - ImDiff Sample Run and JXR test
01-10-11 - Perceptual Results - PDI
01-10-11 - Perceptual Results - mysoup
01-10-11 - Perceptual Results - Moses
01-10-11 - Perceptual Metrics
01-10-11 - Perceptual Metrics Warmup - x264 Settin...
01-10-11 - Perceptual Metrics Warmup - JPEG Settin...
12-11-10 - Perceptual Notes of the Day
12-09-10 - Rank Lookup Error
12-09-10 - Perceptual vs TID
12-06-10 - More Perceptual Notes
12-02-10 - Perceptual Metric Rambles of the Day
11-18-10 - Bleh and TID2008
11-16-10 - A review of some perceptual metrics
11-08-10 - 709 vs 601
11-05-10 - Brief note on Perceptual Metric Mistakes
10-30-10 - Detail Preservation in Images
10-27-10 - Image Comparison - JPEG-XR
10-26-10 - Image Comparison - Hipix vs PDI
10-22-10 - Some notes on Chroma Sampling
10-18-10 - How to make a Perceptual Database
10-16-10 - Image Comparison Part 9 - Kakadu JPEG2000
10-16-10 - Image Comparison Part 11 - Some Notes on the Tests
10-16-10 - Image Comparison Part 10 - x264 Retry
10-15-10 - Image Comparison Part 8 - Hipix
10-15-10 - Image Comparison Part 7 - WebP
10-15-10 - Image Comparison Part 6 - cbwave
10-14-10 - Image Comparison Part 5 - RAD VideoTest
10-14-10 - Image Comparison Part 4 - JPEG vs NewDCT
10-14-10 - Image Comparison Part 3 - JPEG vs AIC
10-14-10 - Image Comparison Part 2
10-12-10 - Image Comparison Part 1


Anonymous said...

You never really drew any final conclusions publicly. Was there a bottom line?

I wish the final perceptual results had included jpeg-huffman as well; while I understand your goal was to compare new ones to the "state-of-the-art" jpeg, I'm curious to also know which ones even beat plain-jane jpeg, and by how much (given that some of us continue to use it, since we have easy-to-use free decoders and such).

cbloom said...

But there *are* jpeg_h results in the final runs :

In fact, see the last post with results :

ImDiff Sample Run and JXR test


Anonymous said...

Well, ok, yes, but only on Lena, not on the three previous "Perceptual Results" posts.

cbloom said...

Yeah, fair enough, I never did the "here's a bunch of metrics for a bunch of compressors on a bunch of images".

The problem is it's information overload; any one metric only shows you part of the picture, so you really want to look at 4 metrics to see what's going on.

Anonymous said...

Yeah, it is a hard problem, I realize.

Your "axes" are: images, metrics, compressors, and compression level.

The last one is the only one that's continuous, so you tended to make that be the x axis, but maybe that's not ideal.

Also, you evaluated the metrics for their correlation to perceived quality, so you could just pick the top metric and stick to that.

So you could make, say, a bar graph that is clusters of bars; each cluster is an image, each bar is a particular metric for that image, the height of the bar is the quality under that metric. Make three bar graphs, one for log bpp -1, one for log bpp 0, and one for log bpp 1.

I dunno, that's probably still terrible too.

Anonymous said...

Whoops, I typo'd "metric" several times where I meant "compressor", since as I said, this assumes you just use the top metric.
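
A rough sketch of the bookkeeping behind that kind of chart, in Python, assuming each compressor has been run at a few bit rates and scored with one chosen metric. The function names, the nested-dict layout, the choice of log base 2, and every number below are made up for illustration; none of it comes from imdiff or the actual results.

```python
import math

def quality_at(curve, log2_bpp):
    """Linearly interpolate quality at a target log2(bpp).

    curve is a list of (bpp, quality) samples for one compressor on one image.
    Returns None if the target lies outside the sampled range (no extrapolation).
    """
    pts = sorted((math.log2(bpp), q) for bpp, q in curve)
    if not pts or not (pts[0][0] <= log2_bpp <= pts[-1][0]):
        return None
    for (x0, q0), (x1, q1) in zip(pts, pts[1:]):
        if x0 <= log2_bpp <= x1:
            if x1 == x0:
                return q0
            t = (log2_bpp - x0) / (x1 - x0)
            return q0 + t * (q1 - q0)
    return pts[0][1]  # single-sample curve that hits the target exactly

def bar_clusters(results, log2_bpp):
    """One cluster per image, one bar per compressor, at a fixed log2(bpp).

    results: {image: {compressor: [(bpp, quality), ...]}}
    returns: {image: {compressor: quality or None}}
    """
    return {
        image: {comp: quality_at(curve, log2_bpp) for comp, curve in by_comp.items()}
        for image, by_comp in results.items()
    }

# Placeholder data only: hypothetical image and codec names, invented scores.
results = {
    "image_a": {
        "codec_x": [(0.5, 3.0), (2.0, 7.0)],
        "codec_y": [(0.5, 2.0), (1.0, 4.0), (2.0, 5.0)],
    },
}

# Three charts: log2(bpp) = -1, 0 and 1.
for target in (-1.0, 0.0, 1.0):
    print(target, bar_clusters(results, target))
```

Interpolating in log-bpp space is needed because different compressors rarely land on exactly the same bit rate; returning None instead of extrapolating leaves a gap in the chart rather than a guessed bar.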

cbloom said...

Nah, I don't just mean it's "information overload" = "it's a hard problem".

What I mean is there's a certain amount of expert analysis needed to make sense of the numbers. Hopefully somebody who read the whole series of posts picks up a bit of the flavor of how I look at the results and learns to do it a bit themselves.

If I just put up a bunch of summary numbers, that tempts people to just go and look at those numbers without reading the more detailed analysis.

I believe it's one of those cases where putting simple numbers on things actually gives you worse information.

It's sort of like how a CIA analyst only passes on their expert summary, not the original source information, because people who are untrained in interpreting the source information can draw foolish conclusions from it (like thinking Iraq was somehow involved in 9/11 LOL); it's a case where the individual facts can actually be misleading unless you keep the whole picture in mind.

It's sort of like giving schools or teachers a single numerical score; you've actually greatly *decreased* the depth of analysis by doing that.

The thing about the perceptual metrics is they can all be fooled in certain ways; you have to look at the behavior under a few metrics and at many bit rates to get a holistic view of the nature of the coder.

Maybe I will write a summary because I have a few more things to say.
