cbloom rants: 04-09-12 - Old Image Comparison Post Gathering

4/09/2012

Perceptual Metrics, imdiff, and such. Don't think I ever did an "index post" so here it is :

7 comments:

Anonymous said...: You never really drew any final conclusions publicly. Was there a bottom line?

I wish the final perceptual results had included jpeg-huffman as well; while I understand your goal was to compare new ones to the "state-of-the-art" jpeg, I'm curious to also know which ones even beat plain-jane jpeg, and by how much (given that some of us continue to use it, since we have easy-to-use free decoders and such).; April 10, 2012 at 4:19 PM
cbloom said...: But there *are* jpeg_h results in the final runs :

In fact, see the last post with results :

ImDiff Sample Run and JXR test

http://cbloomrants.blogspot.com/2011/01/01-12-11-imdiff-sample-run-and-jxr-test.html; April 11, 2012 at 10:16 AM
Anonymous said...: Well, ok, yes, but only on Lena, not on the three previous "Perceptual Results" posts.; April 11, 2012 at 7:45 PM
cbloom said...: Yeah, fair enough, I never did the "here's a bunch of metrics for a bunch of compressors on a bunch of images".

The problem is it's information overload; any one metric only shows you part of the picture, you really want to look at 4 metrics to see what's going on.; April 12, 2012 at 9:40 AM
Anonymous said...: Yeah, it is a hard problem, I realize.

Your "axes" are: images, metrics, compressors, and compression level.

The last one is the only one that's continuous, so you tended to make that be the x axis, but maybe that's not ideal.

Also, you evaluated the metrics for their correlation to perceived quality, so you could just pick the top metric and stick to that.

So you could make, say, a bar graph that is clusters of bars; each cluster is an image, each bar is a particular metric for that image, the height of the bar is the quality under that metric. Make three bar graphs, one for log bpp -1, one for log bpp 0, and one for log bpp 1.

I dunno, that's probably still terrible too.; April 13, 2012 at 6:22 PM
Anonymous said...: Whoops, I typo'd "metric" several times where I meant "compressor", since as I said, this is assuming just use the top metric.; April 13, 2012 at 6:23 PM
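
For anyone who wants to try the clustered-bar layout suggested above, here is a minimal sketch of the data-reduction step, not anything taken from imdiff. It assumes "log bpp" means log2 of bits per pixel, that each compressor was sampled at a few bit rates per image, and that quality between samples can be linearly interpolated. The names `quality_at`, `clusters`, `image_1`, `coder_a` and `coder_b` are hypothetical, and the numbers are placeholders, not measurements.

```python
import math

def quality_at(curve, target_log_bpp):
    """Linearly interpolate quality at a target log2(bpp).

    curve: list of (bpp, quality) samples for one compressor on one image;
    at least two samples are needed.
    Returns None if the target lies outside the sampled range.
    """
    # Work in log2(bpp) space, sorted by rate.
    pts = sorted((math.log2(bpp), q) for bpp, q in curve)
    for (x0, q0), (x1, q1) in zip(pts, pts[1:]):
        if x0 <= target_log_bpp <= x1:
            if x1 == x0:
                return q0
            t = (target_log_bpp - x0) / (x1 - x0)
            return q0 + t * (q1 - q0)
    return None

def clusters(results, target_log_bpp):
    """Reduce {image: {compressor: curve}} to {image: {compressor: quality}}.

    Each image is one cluster of bars; each compressor is one bar.
    """
    return {
        image: {name: quality_at(curve, target_log_bpp)
                for name, curve in by_coder.items()}
        for image, by_coder in results.items()
    }

# Placeholder numbers only, under a single (assumed) top metric.
results = {
    "image_1": {
        "coder_a": [(0.5, 2.0), (1.0, 4.0), (2.0, 6.0)],
        "coder_b": [(0.5, 3.0), (1.0, 5.0), (2.0, 7.0)],
    },
}

# One table (one bar graph) per target rate: log bpp -1, 0 and 1.
for target in (-1, 0, 1):
    print(target, clusters(results, target))
```

Each printed dictionary holds the bar heights for one graph. Plotting them is left out on purpose, and this still reduces each coder to three numbers per image, so it does not replace looking at the full curves.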
cbloom said...: Nah, I don't just mean it's "information overload" = "it's a hard problem".

What I mean is there's a certain amount of expert analysis needed to make sense of the numbers. Hopefully somebody who read the whole series of posts picks up a bit of the flavor of how I look at the results and learns to do it a bit themselves.

If I just put up a bunch of summary numbers, that tempts people to just go and look at those numbers without reading the more detailed analysis.

I believe it's one of those cases where putting simple numbers on things actually gives you worse information.

It's sort of like how a CIA analyst only passes on their expert summary, not the original source information, because people who are untrained in interpreting the source information can make foolish conclusions from it (like thinking Iraq was somehow involved in 9/11 LOL); it's a case where the individual facts can actually be misleading unless you keep the whole picture in mind.

It's sort of like giving schools or teachers a single numerical score; you've actually greatly *decreased* the depth of analysis by doing that.

The thing about the perceptual metrics is they can all be fooled in certain ways; you have to look at the behavior under a few metrics, and at many bit rates to get a holistic view of the nature of the coder.

Maybe I will write a summary because I have a few more things to say.; April 14, 2012 at 9:31 AM
