11-18-09 - Raw Conversion

CR2 format is a big mess of complication. My god I hate TIFF. The main thing is the sensor data. It appears to be stored as "lossless JPEG", which is not the usual DCT JPEG (and not JPEG-LS either): it's the lossless mode of the original JPEG standard, which predicts each sample from its neighbors and then just codes the residual with normal JPEG Huffman coding. The sensor data is RGGB, which they store either as 4 channels per pixel [RGGB per pixel] or as 2 channels [GR or GB]. Either way is clearly not optimal. One interesting thing I could do if I cracked the CR2 format is store all these raws smaller with a better compressor. The RAWs from the S90 are around 11M on average; it uses the 2-channel mode, so the RAWs are 1872x2784 two-channel pixels = 3744x2784 samples at 12 bits per sample. That means the JPEG is getting to about 8.85 bits per sample (counting 11M as 11*2^20 bytes). Not very good.
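
The bits-per-sample figure is easy to check. A quick sketch, assuming "11M" means 11*2^20 bytes and that the whole file is sensor data (it isn't quite, there are headers and a preview in there too):

```python
# Figures taken from the paragraph above; reading "11M" as 11 MiB is an assumption.
FILE_BYTES = 11 * 1024 * 1024
SAMPLES_PER_ROW = 1872 * 2   # 1872 two-channel pixels per row
ROWS = 2784
RAW_BITS = 12                # bits per sample before compression


def bits_per_sample(file_bytes, samples_per_row, rows):
    """Average number of coded bits spent on each sensor sample."""
    return file_bytes * 8 / (samples_per_row * rows)


bps = bits_per_sample(FILE_BYTES, SAMPLES_PER_ROW, ROWS)
print(f"{bps:.2f} bits per sample, {bps / RAW_BITS:.0%} of the uncompressed 12 bits")
```

So the lossless JPEG only saves roughly a quarter of the raw size.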

Of course I probably have to use dcraw to read it for me, but dcraw is just about the worst piece of code I've ever seen in my life. It's a miracle to me that people are able to write functioning software from code like that.

Paul Lee has a modified dcraw and some nice sample pictures of how demosaicing can go wrong (click the Moire or Aliasing links).

My idea for high quality RAW processing :

First of all, abandon your idea of an "image" as a matrix (grid) of colors (aka a bitmap).

The S90 lens has barrel distortion that's corrected in software.

It also samples colors in an RGGB Bayer mosaic pattern (like most cameras).

The two of those things combined mean that you really just have a collection of independent R's, G's, and B's at
irregular positions (not on a grid due to barrel distortion).
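
To make that concrete, here's a minimal sketch of turning a raw mosaic into a list of scattered samples. The one-term radial model and the coefficient `k1` are hypothetical stand-ins; I don't know what correction the S90 actually uses. The RGGB layout with R at (0,0) is also an assumption.

```python
import math


def bayer_channel(x, y):
    """Channel of the sample at column x, row y, assuming RGGB with R at (0, 0)."""
    return ("RG", "GB")[y % 2][x % 2]


def undistorted_position(x, y, width, height, k1):
    """Move a sensor position to where it belongs in the ideal image.

    Hypothetical one-term radial model: r' = r * (1 + k1 * r^2), with r
    normalized so the image corners sit at r = 1. k1 > 0 pushes samples
    outward, which is the direction needed to undo barrel distortion.
    """
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    half_diag = math.hypot(cx, cy) or 1.0   # avoid dividing by zero on a 1x1 image
    dx, dy = (x - cx) / half_diag, (y - cy) / half_diag
    scale = 1.0 + k1 * (dx * dx + dy * dy)
    return cx + dx * scale * half_diag, cy + dy * scale * half_diag


def scattered_samples(raw, k1):
    """Flatten a 2D mosaic into (channel, x, y, value) tuples at corrected positions."""
    height, width = len(raw), len(raw[0])
    samples = []
    for y, row in enumerate(raw):
        for x, value in enumerate(row):
            u, v = undistorted_position(x, y, width, height, k1)
            samples.append((bayer_channel(x, y), u, v, value))
    return samples
```

After this step there is no grid any more, just a bag of single-channel samples with floating point positions.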

Now, you should also know that you need to do things like denoising on these original samples, NOT on
the grid of colors after conversion to a bitmap.

So I want to denoise directly on the source data of irregular color samples.
Denoising R & B should make use of the higher quality G data.

Denoising should of course use edge detection and other models of the image prior to make a Bayesian
maximum likelihood estimate of the sample without noise.
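
As a rough illustration of using G to help R and B, here's a cross (joint) bilateral filter on scattered samples. To be clear, this is a much simpler stand-in, not the Bayesian estimate described above. It assumes each R or B sample already carries a `guide` value, i.e. a G estimate at the same position (say, interpolated from nearby G samples); how you get that is left out. The sigmas are made-up defaults.

```python
import math


def cross_bilateral(points, sigma_space=2.0, sigma_guide=0.05):
    """Denoise scattered R (or B) samples, using G as an edge-aware guide.

    points: list of (x, y, value, guide) tuples; `guide` is an assumed
    G estimate at the sample position. Neighbors count more when they are
    close in space AND similar in the guide, so averaging does not cross
    edges that show up in G. Brute-force O(n^2); fine for a sketch only.
    """
    denoised = []
    for x0, y0, _, g0 in points:
        num = den = 0.0
        for x, y, value, guide in points:
            d2 = (x - x0) ** 2 + (y - y0) ** 2
            w_space = math.exp(-d2 / (2.0 * sigma_space ** 2))
            w_guide = math.exp(-((guide - g0) ** 2) / (2.0 * sigma_guide ** 2))
            num += w_space * w_guide * value
            den += w_space * w_guide
        denoised.append(num / den)   # den >= 1: each sample weights itself by 1
    return denoised


# Noisy R samples on either side of an edge that is clean in G.
left = [(i, 0.0, v, 0.2) for i, v in enumerate([0.08, 0.12, 0.09, 0.11])]
right = [(i + 4, 0.0, v, 0.8) for i, v in enumerate([0.88, 0.92, 0.91, 0.89])]
result = cross_bilateral(left + right)
```

The noise on each side gets averaged down, but the step between the two sides survives because the G guide says there is an edge there.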

To output a bitmap you need to sample from this irregular lattice of samples (mosaic'ed and distorted).

Resampling creates aliasing and loss of information, so you only want to do it once ever on an image.

There's absolutely no a priori reason why we should be resampling to the same resolution as the sensor
here.  You should resample at this point directly to the final resolution that you want for your image.
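
A minimal sketch of that single resampling step: splat each scattered sample onto an output grid of any size with Gaussian weights, then normalize. The Gaussian kernel, `sigma` and `radius` are my assumptions for illustration; a serious version would want a better reconstruction filter and would make use of edges.

```python
import math


def resample(samples, src_size, out_size, sigma=0.7, radius=2):
    """Resample scattered (channel, x, y, value) samples to an out_size grid.

    Positions are in source pixel units; sigma and radius are in output
    pixels. Returns {channel: rows of floats}; pixels that no sample
    reached are left at 0.0.
    """
    src_w, src_h = src_size
    out_w, out_h = out_size
    sx, sy = out_w / src_w, out_h / src_h
    acc = {c: [[0.0] * out_w for _ in range(out_h)] for c in "RGB"}
    wsum = {c: [[0.0] * out_w for _ in range(out_h)] for c in "RGB"}
    for channel, x, y, value in samples:
        # Pixel-center convention: source pixel x covers [x, x + 1).
        ox = (x + 0.5) * sx - 0.5
        oy = (y + 0.5) * sy - 0.5
        px0, py0 = round(ox), round(oy)
        for py in range(max(0, py0 - radius), min(out_h, py0 + radius + 1)):
            for px in range(max(0, px0 - radius), min(out_w, px0 + radius + 1)):
                d2 = (px - ox) ** 2 + (py - oy) ** 2
                w = math.exp(-d2 / (2.0 * sigma * sigma))
                acc[channel][py][px] += w * value
                wsum[channel][py][px] += w
    return {
        c: [
            [a / w if w > 0.0 else 0.0 for a, w in zip(acc_row, w_row)]
            for acc_row, w_row in zip(acc[c], wsum[c])
        ]
        for c in "RGB"
    }
```

Note that the output size is just a parameter; nothing ties it to the sensor resolution, and the result is already floating point.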

For example with the S90, rather than outputting the stupid resolution 3648x2736, I would just output 3200x2400.
That's an exact 2:1 ratio to 1600x1200, so I can view images on 1600x1200 monitors with a simple box down-filter, which
will make them appear much higher quality in practice (vs 3648x2736 viewed at 1600x1200, a 2.28:1 ratio that involves a nasty blurring down-filter).
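
The reason the 2:1 case is nice: every output pixel is the plain average of its own 2x2 block, with no fractional overlap between neighbors. A small sketch (the function name is mine):

```python
def box_downsample(image, factor):
    """Average each factor x factor block of a 2D list of numbers.

    Only defined for integer ratios; that is exactly why 3200 -> 1600 works
    and 3648 -> 1600 does not.
    """
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("box filter needs an integer ratio in both dimensions")
    area = factor * factor
    return [
        [
            sum(image[y + dy][x + dx] for dy in range(factor) for dx in range(factor)) / area
            for x in range(0, width, factor)
        ]
        for y in range(0, height, factor)
    ]


for src_width in (3648, 3200):
    print(src_width, "-> 1600 is a ratio of", src_width / 1600)
```

At 2.28:1 the viewer has to use some wider filter that mixes in partial pixels, and that's where the extra blur comes from.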

The output from this should be a floating point bitmap so that we don't throw away any color resolution.

Exposure correction can then be done on the floating point bitmap without worrying about the irregular
lattice or any further resampling issues.
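
A small sketch of why floating point matters here, assuming the values are linear light in [0, 1] and that exposure is just a gain of 2^stops (both assumptions; real converters do more than this):

```python
def adjust_exposure(image, stops):
    """Scale linear-light float values by 2**stops (+1 stop doubles the light)."""
    gain = 2.0 ** stops
    return [[value * gain for value in row] for row in image]


def quantize_8bit(image):
    """Clamp to [0, 1] and round to 8-bit integer levels; do this only at the very end."""
    return [[min(255, max(0, round(value * 255.0))) for value in row] for row in image]


ramp = [[i / 255.0 for i in range(256)]]

# Float pipeline: darken a stop, brighten a stop, nothing is lost.
float_round_trip = adjust_exposure(adjust_exposure(ramp, -1.0), +1.0)
print("float round trip is exact:", float_round_trip == ramp)

# 8-bit pipeline: quantizing after darkening merges neighboring levels for good.
darkened_8bit = quantize_8bit(adjust_exposure(ramp, -1.0))
print(len(set(darkened_8bit[0])), "distinct levels left out of 256")
```

Once levels have been merged by quantization, no later exposure tweak can get them back, which is the whole argument for staying in float until output.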


won3d said...

FWIW, I think your ideas are great. Is it me or do these seem rather straightforward? In particular, the bit about resampling to sensor resolution is probably because it is hard to market sensor resolution separate from image resolution.

The whole geometric distortion thing makes it totally obvious. If there is barrel distortion, it will be different for the different color bands because of chromatic aberration. Now, there are time-worn techniques to deal with chromatic aberration, but these end up with more complex, heavier, more expensive lenses. Screw that. The optics should really only optimize for having reasonable focus; geometric and chromatic correction should just be done in software.

cbloom said...

"The optics should really only optimize for having reasonable focus; geometric and chromatic correction should just be done in software."

Yeah I think the S90 and the LX3 show that the camera makers agree. I think we'll see even more lenses in the future that just punt on distortion and chromatic aberration and let the software fix it.

I don't have a problem with that at all; if it lets them make smaller lenses that let in more light (like the S90 and LX3) then I say go for it. I just pray that they let us get at the raw sensor data to fix it up ourselves rather than having firmware do it, which I fear is the way of the future.

won3d said...

There's this:


I like the Stanford computational photography stuff, but I guess it is a bit unclear what kind of impact they will have. I haven't heard much out of Refocus Imaging, for example.

So I have all the faith in the world that software correction is the way to go, but I wonder if this technique extends to interchangeable lens systems. One of the things about the new micro 4/3rds format is that they are doing corrections in software, but how do they correct for different lenses? Is it somehow encoded in the lens itself (maybe as Zernike polynomials)? Or is there some standard distortion profile that happens to be easier to design for?

But maybe I should stop talking now, since I don't actually know much about actual photography, even though I did some stuff with optics. Meh, it wouldn't be the internet if people didn't speak authoritatively about things they didn't understand.
