RMSE of fit vs. observed MOS data :
RMSE_RGB        : 1.052392
SCIELAB_RMSE    : 0.677143
SCIELAB_MyDelta : 0.658017
MS_SSIM_Y       : 0.608917
MS_SSIM_IW_Y    : 0.555934
PSNRHVSM_Y      : 0.521825
PSNRHVST_Y      : 0.500940
PSNRHVST_YUV    : 0.480360
MyDctDelta_Y    : 0.476927
MyDctDelta_YUV  : 0.444007
BTW I don't actually use the raw RMSE as posted above. I weight each point by the sdev of the observed MOS data; that is, smaller sdev = you care about those points more. See previous blog posts on this issue. The sdev-biased scores (which are what was posted in previous blog posts) are :
RMSE_RGB        : 1.165620
SCIELAB_RMSE    : 0.738835
SCIELAB_MyDelta : 0.720852
MS_SSIM_Y       : 0.639153
MS_SSIM_IW_Y    : 0.563823
PSNRHVSM_Y      : 0.551926
PSNRHVST_Y      : 0.528873
PSNRHVST_YUV    : 0.515720
MyDctDelta_Y    : 0.490206
MyDctDelta_YUV  : 0.458081
Combo           : 0.436670 (*)
(* = ADDENDUM : I added "Combo" which is the best linear combo of SCIELAB_MyDelta + MS_SSIM_IW_Y + MyDctDelta_YUV ; it's a static linear combo, obviously you could do better by going all Netflix-Prize-style and treating each metric as an "expert" and doing weighted experts based on various decision attributes of the image; eg. certain metrics will do better on certain types of images so you weight them from that).
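For readers who want to try something similar, here is a rough sketch of an sdev-weighted RMSE and a static linear combo fit. The exact weighting used for the numbers above is not given in this post, so the 1/sdev weights, the normalization by the sum of weights, and the function names (weighted_rmse, fit_linear_combo, solve) are all assumptions for illustration. The combo is fit by plain weighted least squares over already-fitted metric scores, with no intercept term.

```python
import math

def weighted_rmse(fitted, mos, sdev, eps=1e-6):
    """RMSE with each point weighted by 1/sdev (ASSUMED weighting, not from the post)."""
    weights = [1.0 / max(s, eps) for s in sdev]
    num = sum(w * (f - m) ** 2 for w, f, m in zip(weights, fitted, mos))
    return math.sqrt(num / sum(weights))

def solve(a, b):
    """Solve the small dense system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_linear_combo(metrics, mos, sdev, eps=1e-6):
    """Weighted least squares for a static linear combo.

    metrics: one row per image, one column per metric score.
    Returns coefficients c minimizing sum_i w_i * (dot(c, metrics[i]) - mos[i])^2,
    with w_i = 1/sdev_i (same ASSUMED weighting as weighted_rmse).
    """
    weights = [1.0 / max(s, eps) for s in sdev]
    k = len(metrics[0])
    # Normal equations: (A^T W A) c = A^T W y
    ata = [[sum(w * row[i] * row[j] for w, row in zip(weights, metrics))
            for j in range(k)] for i in range(k)]
    aty = [sum(w * row[i] * y for w, row, y in zip(weights, metrics, mos))
           for i in range(k)]
    return solve(ata, aty)

def apply_combo(coeffs, metrics):
    """Evaluate the combo score for each image."""
    return [sum(c * v for c, v in zip(coeffs, row)) for row in metrics]
```

Usage would be something like: build one row per image from the three fitted metric scores, call fit_linear_combo, then score the result with weighted_rmse(apply_combo(coeffs, rows), mos, sdev). Fitting and scoring on the same images will of course flatter the combo.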
As a sanity check I made plots ; the X axis is the human observed MOS score, the Y axis is the fitted metric :
Sanity is confirmed. (the RMSE_RGB plot has those horizontal lines because one of the distortion types is RGB random noise at a few fixed RMSE levels - you can see that for the same amount of RGB RMSE noise there are a variety of human MOS scores).
ADDENDUM : if you haven't followed old posts, this is on the TID2008 database (without "exotics"). I really need to find another database to cross-check to make sure I haven't over-trained.
Some quick notes of what worked and what didn't work.
What worked :

Variance Masking of high-frequency detail
Variance Masking of DC deltas
PSNRHVS JPEG-style visibility thresholds
Using the right spatial scale for each piece of the metric (eg. what size window for local sdev, what spatial filter for DC delta)
Space-frequency subband energy preservation
Frequency subband weighting

What didn't work :

Luma Masking
LAB or other color spaces than YUV in most metrics
anything but "Y" as the most important part of the metric
Nonlinear mappings of signal and perception (other than the nonlinear mapping already in gamma correction)
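To make "variance masking" a bit more concrete, here is a toy sketch of the general idea: errors count for less where the original image is locally busy. This is not the actual metric from this post. The masking formula 1 + strength * local_variance, the window radius, and the strength constant are all made up for illustration, and it works on a single plane (eg. Y) stored as a list of rows.

```python
def local_variance(img, x, y, radius):
    """Variance of the pixels in a (2*radius+1) square window, clamped at the image edges."""
    h, w = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - radius), min(h, y + radius + 1))
            for i in range(max(0, x - radius), min(w, x + radius + 1))]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def variance_masked_mse(orig, dist, radius=2, strength=0.01):
    """Mean squared error with each pixel's error divided by a masking term.

    The masking term grows with the local variance of the ORIGINAL image,
    so the same error is penalized less in busy regions than in flat ones.
    radius and strength are hypothetical tuning parameters.
    """
    h, w = len(orig), len(orig[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            err = orig[y][x] - dist[y][x]
            mask = 1.0 + strength * local_variance(orig, x, y, radius)
            total += err * err / mask
    return total / (h * w)
```

The window radius here is the sort of "spatial scale" knob mentioned in the list above; the best value would have to be tuned against MOS data.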