cbloom rants: Followup tidbits on RGBE

As noted previously, RGBE 8888 is not a very good encoding for HDR in 32 bits. I haven't personally evaluated the other options, but from what I've read, the 16-8-8 LogLUV looks okay. You want more bits of precision for luminance, and the only way to do that is to go into some kind of luma-chroma space.

In any case, we'll look at a couple RGBE followup topics because I think they may be educational. Do NOT use these. This is for our education only, don't copy paste these and put them in production! If you want an RGBE conversion you can use, see the previous post!

In the previous post I wrote that I generally prefer centered quantization that does bias on encode. This is different than what is standard for Radiance HDR RGBE files. (DO NOT USE THIS). But say you wanted to do that, what would it look like exactly?


#include <math.h> // frexpf, ldexpf

// float RGB -> U8 RGBE quantization
void float_to_rgbe_centered(unsigned char * rgbe,const float * rgbf)
{
    // NOT HDR Radiance RGBE conversion! don't use me!
        
    float maxf = rgbf[0] > rgbf[1] ? rgbf[0] : rgbf[1];
    maxf = maxf > rgbf[2] ? maxf : rgbf[2];

    if ( maxf <= 1e-32f )
    {
        // Exponent byte = 0 is a special encoding that makes RGB output = 0
        rgbe[0] = rgbe[1] = rgbe[2] = rgbe[3] = 0;
    }
    else
    {
        int exponent;
        frexpf(maxf, &exponent);
        float scale = ldexpf(1.f, -exponent + 8);
    
        // bias might push us up to 256
        // instead increase the exponent and send 128
        if ( maxf*scale >= 255.5f )
        {
            exponent++;
            scale *= 0.5f;
        }
    
        // NOT HDR Radiance RGBE conversion! don't use me!
        rgbe[0] = (unsigned char)( rgbf[0] * scale + 0.5f );
        rgbe[1] = (unsigned char)( rgbf[1] * scale + 0.5f );
        rgbe[2] = (unsigned char)( rgbf[2] * scale + 0.5f );
        rgbe[3] = (unsigned char)( exponent + 128 );
    }
}

// U8 RGBE -> float RGB dequantization
void rgbe_to_float_centered(float * rgbf,const unsigned char * rgbe)
{
    // NOT HDR Radiance RGBE conversion! don't use me!

    if ( rgbe[3] == 0 )
    {
        rgbf[0] = rgbf[1] = rgbf[2] = 0.f;
    }
    else
    {
        // NOT HDR Radiance RGBE conversion! don't use me!

        float fexp = ldexpf(1.f, (int)rgbe[3] - (128 + 8));
        // centered restoration, no bias :
        rgbf[0] = rgbe[0] * fexp;
        rgbf[1] = rgbe[1] * fexp;
        rgbf[2] = rgbe[2] * fexp;
    }
}

What's the difference in practice?

On random floats, there is no difference. This has the same 0.39% max round trip error as the reference implementation that does bias on decode.
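To see why the two conventions tie on arbitrary values, here is a minimal scalar sketch. The helper names are made up for illustration and it assumes a plain 1-unit step quantizer rather than the full RGBE codec. Convention A floors on encode and restores to the bucket center; convention B rounds on encode and restores the integer as-is. Both end up with a worst-case error of half a step.

```c
#include <assert.h>

// Convention A (bias on decode): floor on encode, restore to bucket center.
static int   encode_floor(float x) { return (int)x; }
static float decode_center(int q)  { return q + 0.5f; }

// Convention B (bias on encode, "centered"): round on encode, restore as-is.
static int   encode_round(float x) { return (int)(x + 0.5f); }
static float decode_plain(int q)   { return (float)q; }

static float abs_diff(float a, float b) { return a > b ? a - b : b - a; }

// Sweep x over [0,255] in steps of 1/64 and report the worst absolute
// round-trip error seen for each convention.
static void measure_scalar_quantizers(float *max_err_a, float *max_err_b)
{
    assert(max_err_a && max_err_b);
    *max_err_a = 0.f;
    *max_err_b = 0.f;
    for (int i = 0; i <= 255 * 64; i++)
    {
        float x = i / 64.0f; // exactly representable in float
        float err_a = abs_diff(decode_center(encode_floor(x)), x);
        float err_b = abs_diff(decode_plain(encode_round(x)), x);
        if (err_a > *max_err_a) *max_err_a = err_a;
        if (err_b > *max_err_b) *max_err_b = err_b;
    }
}
```

The worst cases just sit in different places: convention A is worst on exact integers, convention B is worst halfway between them.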

The difference is that on integer colors, centered quantization restores them exactly. Specifically : for all the 24-bit LDR (low dynamic range) RGB colors, the "centered" version here has zero error, perfect restoration.

That sounds pretty sweet but it's not actually helpful in practice, because the way we in games use HDR data typically has the LDR range scaled in [0,1.0], not [0,255]. The "centered" way does preserve 0 and 1 exactly.
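A rough way to check the exact-restoration claims is to round-trip the colors and compare. The sketch below uses compact copies of the centered encoder and decoder from this post (still NOT Radiance RGBE, still not for production); `roundtrips_exactly` and `count_ldr_mismatches` are hypothetical helper names added for the check.

```c
#include <assert.h>
#include <math.h>

// Compact copy of the "centered" encoder above (experiment only).
static void enc_centered(unsigned char *rgbe, const float *rgbf)
{
    assert(rgbf[0] >= 0.f && rgbf[1] >= 0.f && rgbf[2] >= 0.f);
    float maxf = rgbf[0] > rgbf[1] ? rgbf[0] : rgbf[1];
    maxf = maxf > rgbf[2] ? maxf : rgbf[2];
    if (maxf <= 1e-32f) { rgbe[0] = rgbe[1] = rgbe[2] = rgbe[3] = 0; return; }
    int exponent;
    frexpf(maxf, &exponent);
    float scale = ldexpf(1.f, -exponent + 8);
    if (maxf * scale >= 255.5f) { exponent++; scale *= 0.5f; }
    for (int c = 0; c < 3; c++)
        rgbe[c] = (unsigned char)(rgbf[c] * scale + 0.5f);
    rgbe[3] = (unsigned char)(exponent + 128);
}

// Compact copy of the "centered" decoder above; E byte 0 decodes to black.
static void dec_centered(float *rgbf, const unsigned char *rgbe)
{
    float fexp = rgbe[3] ? ldexpf(1.f, (int)rgbe[3] - (128 + 8)) : 0.f;
    for (int c = 0; c < 3; c++)
        rgbf[c] = rgbe[c] * fexp;
}

// Round-trips one color and reports whether it came back exactly.
static int roundtrips_exactly(float r, float g, float b)
{
    float in[3] = { r, g, b }, out[3];
    unsigned char rgbe[4];
    enc_centered(rgbe, in);
    dec_centered(out, rgbe);
    return in[0] == out[0] && in[1] == out[1] && in[2] == out[2];
}

// Counts how many of the 2^24 integer colors in [0,255]^3 fail to round-trip.
static long count_ldr_mismatches(void)
{
    long bad = 0;
    for (int r = 0; r < 256; r++)
        for (int g = 0; g < 256; g++)
            for (int b = 0; b < 256; b++)
                bad += !roundtrips_exactly((float)r, (float)g, (float)b);
    return bad;
}
```

The integer case works because the scale is a power of two that is at least 1 for values up to 255, so every scaled component is already an integer and the +0.5 bias never changes it. A value like 0.3 next to a max of 1.0 does not come back exactly, which is the usual situation for [0,1] data.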

The other thing I thought might be fun to look at is :

The Radiance RGBE conversion has 0.39% max round trip error. That's exactly the same as a flat quantizer from the unit interval to 7 bits. (the bad conversion that did floor-floor had max error of 0.78% - just the same as a flat quantizer to 6 bits).
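The arithmetic behind those percentages can be sketched like this, assuming a "flat quantizer to n bits" means 2^n equal buckets on the unit interval with a worst-case error of half a bucket (the function names are mine, and other bucket-count conventions shift the last digit slightly):

```c
#include <assert.h>

// Max round-trip error, in percent of the interval, of a flat quantizer
// that splits [0,1] into 2^bits equal buckets: half a bucket, 0.5 / 2^bits.
static double flat_quantizer_max_error_percent(int bits)
{
    assert(bits >= 0 && bits < 31);
    return 100.0 * 0.5 / (double)(1 << bits);
}

// RGBE with correct rounding: the max component lands in [128,256) and is
// off by at most half a unit, so the worst case relative to max is 0.5/128.
static double rgbe_max_error_percent(void)
{
    return 100.0 * 0.5 / 128.0;
}
```

Under that assumption 7 bits gives 0.390625% and 6 bits gives 0.78125%, matching the numbers quoted above, and the RGBE figure is the 7-bit one because the top bit of the max component is always on.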

But our RGBE fields all have 8 bits. We should be able to get 8 bits of precision. How would you do that?

Well one obvious issue is that we are sending the max component with the top bit on. It's in [128,255], we always have the top bit set and then only get 7 bits of precision. We could send that more like a real floating point encoding with an implicit top bit, and use all 8 bits.
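The implicit top bit idea in isolation looks something like the sketch below. The function names are hypothetical; the only assumption is that the max component has already been scaled into [256,512), so bit 8 is always set and does not need to be stored.

```c
#include <assert.h>

// Store a 9-bit mantissa in [256,512) as 8 bits by dropping the top bit,
// which is always set and therefore carries no information.
static unsigned char drop_implicit_bit(int mantissa9)
{
    assert(mantissa9 >= 256 && mantissa9 < 512);
    return (unsigned char)(mantissa9 & 0xFF);
}

// Restore the 9-bit mantissa by putting the implicit top bit back on.
static int restore_implicit_bit(unsigned char stored)
{
    return (int)stored + 256;
}
```

All 256 byte values are now meaningful for the max component, instead of only the 128 values with the top bit on.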

If we do that, then the decoder needs to know which component was the max to put the implicit top bit back on. So we need to signal it. Well, fortunately we have 8 bits for the exponent which is way more dynamic range than we need for HDR imaging, so we can take 2 bits from there to send the max component index and leave 6 bits for exponent.
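Splitting the E byte into a 6-bit exponent and a 2-bit max index could be sketched as follows. The function names, the bias of 32, and the allowed exponent range of [-31,31] are assumptions for this sketch; the range is chosen so the packed byte is never 0, which leaves 0 free as the "all black" special value.

```c
#include <assert.h>

// Pack a biased 6-bit exponent (high bits) and a 2-bit max-component
// index (low bits) into one byte. Never returns 0 for valid inputs.
static unsigned char pack_exp_and_index(int exponent, int maxi)
{
    assert(exponent >= -31 && exponent <= 31);
    assert(maxi >= 0 && maxi <= 2);
    return (unsigned char)(((exponent + 32) << 2) | maxi);
}

// Inverse of pack_exp_and_index.
static void unpack_exp_and_index(unsigned char e, int *exponent, int *maxi)
{
    *maxi = e & 3;
    *exponent = (e >> 2) - 32;
}
```

Six bits of binary exponent still covers roughly 2^-31 to 2^31, which is about 18 decimal orders of magnitude.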

Then we also want to make sure we use the full 8 bits for the non-maximal components. To do that we can scale their fractional size relative to max up to 255.
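Here is a minimal sketch of the fraction-of-max step on its own, with hypothetical function names. It assumes the encoder and decoder agree on exactly the same max value; the full encoder has to be more careful because the decoder only sees the quantized max.

```c
#include <assert.h>

// Quantize a non-max component as a fraction of the max, scaled to [0,255].
static unsigned char quantize_fraction(float value, float maxf)
{
    assert(maxf > 0.f && value >= 0.f && value <= maxf);
    return (unsigned char)(value * 255.f / maxf + 0.5f);
}

// Restore the component from its quantized fraction of the max.
static float dequantize_fraction(unsigned char q, float maxf)
{
    return q * maxf / 255.f;
}
```

With a max of 150, plain RGBE would only ever use byte values 0 to 150 for the other components; here a component equal to the max maps to 255, so the whole byte range is in play.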

Go through the work and we get what I call "rgbeplus" :


/**

! NOT Radiance HDR RGBE ! DONT USE ME !

"rgbeplus" packing

still doing 8888 RGBE, one field in each 8 bits, not the best possible general 32 bit packing

how to get a full 8 bits of precision for each component
(eg. maximum error 0.19% instead of 0.39% like RGBE)

for the max component, we store an 8-bit mantissa without the implicit top bit
  (like a real floating point encoding, unlike RGBE which stores the on bit)
  (normal RGBE has the max component in 128-255 so only 7 bits of precision)

because we aren't storing the top bit we need to know which component was the max
  so the decoder can find it

we put the max component index in the E field, so we only get 6 bits for exponent
  (6 is plenty of orders of magnitude for HDR images)
  
then for the non-max fields, we need to get a full 8 bits for them too  
  in normal RGBE they waste the bit space above max, because we know they are <= max
  eg. if max component was 150 , then the other components can only be in [0,150]
    and all the values above that are wasted precision
  therefore worst case in RGBE the off-max components also only have 7 bits of precision.
  To get a full 8, we convert them to fractions of max :
  frac = not_max / max
  which we know is in [0,1]
  and then scale that up by 255 so it uses all 8 bits

this all sounds a bit complicated but it's very simple to decode

I do centered quantization (bias on encode, not on decode)

**/

// float RGB -> U8 RGBE quantization
void float_to_rgbeplus(unsigned char * rgbe,const float * rgbf)
{
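    // rgbf[] should all be >= 0 , RGBE does not support signed values
    // (myassert and u8_check are helpers not shown in this post; u8_check
    //  presumably checks the value fits in [0,255] and casts to unsigned char)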
    
    // ! NOT Radiance HDR RGBE ! DONT USE ME !

    // find max component :
    int maxi = 0;
    if ( rgbf[1] > rgbf[0] ) maxi = 1;
    if ( rgbf[2] > rgbf[maxi] ) maxi = 2;
    float maxf = rgbf[maxi];

    // 0x1.p-32 ?
    if ( maxf <= 1e-10 ) // power of 10! that's around 2^-32
    {
        // Exponent byte = 0 is a special encoding that makes RGB output = 0
        rgbe[0] = rgbe[1] = rgbe[2] = rgbe[3] = 0;
    }
    else
    {
        int exponent;
        frexpf(maxf, &exponent);
        float scale = ldexpf(1.f, -exponent + 9);
        // "scale" is just a power of 2 to put maxf in [256,512)
        
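        // 6 bits of exponent :
        // (flush exponent == -32 too : with maxi == 0 it would pack to E byte 0,
        //  which the decoder reads as the "all zero" special case)
        if ( exponent <= -32 )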
        {
            // Exponent byte = 0 is a special encoding that makes RGB output = 0
            rgbe[0] = rgbe[1] = rgbe[2] = rgbe[3] = 0;
            return;
        }
        myassert( exponent < 32 );
        
        // bias quantizer in encoder (centered restoration quantization)
        int max_scaled = (int)( maxf * scale + 0.5f );
        if ( max_scaled == 512 )
        {
            // slipped up because of the round in the quantizer
            // instead do ++ on the exp
            scale *= 0.5f;
            exponent++;
            //max_scaled = (int)( maxf * scale + 0.5f );
            //myassert( max_scaled == 256 );
            max_scaled = 256;
        }
        myassert( max_scaled >= 256 && max_scaled < 512 );
        
        // grab the 8 bits below the top bit :
        rgbe[0] = (unsigned char) max_scaled;
        
        // to scale the other two components
        //  we need to use the maxf the *decoder* will see
        float maxf_dec = max_scaled / scale;
        myassert( fabsf(maxf - maxf_dec) <= (0.5/scale) );
        
        // scale lower components to use full 255 for their fractional magnitude :
        int i1 = (maxi+1)%3;
        int i2 = (maxi+2)%3;
        rgbe[1] = u8_check( rgbf[i1] * 255.f / maxf_dec + 0.4999f );
        rgbe[2] = u8_check( rgbf[i2] * 255.f / maxf_dec + 0.4999f );
        
        // rgbf[i1] <= maxf
        // so ( rgbf[i1] * 255.f / maxf ) <= 255
        // BUT
        // warning : maxf_dec can be lower than maxf
        // maxf_dec is lower by a maximum of (0.5/scale)
        // worst case is 
        // (rgbf[i1] * 255.f / maxf_dec ) <= 255.5
        // so you can't add + 0.5 or you will go to 256
        // therefore we use the fudged bias 0.4999f
        
        rgbe[3] = (unsigned char)( ( (exponent + 32) << 2 ) + maxi );
    }
}

// U8 RGBE -> float RGB dequantization
void rgbeplus_to_float(float * rgbf,const unsigned char * rgbe)
{
    // ! NOT Radiance HDR RGBE ! DONT USE ME !

    if ( rgbe[3] == 0 )
    {
        rgbf[0] = rgbf[1] = rgbf[2] = 0.f;
    }
    else
    {
        int maxi = rgbe[3]&3;
        int exp = (rgbe[3]>>2) - 32;
        float fexp = ldexpf(1.f, exp - 9);
        float maxf = (rgbe[0] + 256) * fexp;
        float f1 = rgbe[1] * maxf / 255.f;
        float f2 = rgbe[2] * maxf / 255.f;
        int i1 = (maxi+1)%3;
        int i2 = (maxi+2)%3;
        rgbf[maxi] = maxf;
        rgbf[i1] = f1;
        rgbf[i2] = f2;
    }
}

And this does in fact get a full 8 bits of precision. The max round trip error is 0.196%, the same as a flat quantizer to 8 bits.

(max error is always measured as a percent of the max component, not of the component that has the error; any shared exponent format has 100% max error if you measure as a percentage of the component)
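The two ways of measuring can be sketched like this (function names are mine, and both assume non-negative inputs):

```c
#include <assert.h>

static float abs_diff(float a, float b) { return a > b ? a - b : b - a; }

// Largest absolute component error, as a fraction of the largest
// original component. This is the metric used for the numbers in this post.
static float error_vs_max(const float *orig, const float *decoded)
{
    float maxf = orig[0] > orig[1] ? orig[0] : orig[1];
    maxf = maxf > orig[2] ? maxf : orig[2];
    assert(maxf > 0.f);
    float worst = 0.f;
    for (int c = 0; c < 3; c++)
    {
        float err = abs_diff(orig[c], decoded[c]);
        if (err > worst) worst = err;
    }
    return worst / maxf;
}

// Largest error relative to each component's own value
// (components that are exactly 0 are skipped).
static float error_vs_component(const float *orig, const float *decoded)
{
    float worst = 0.f;
    for (int c = 0; c < 3; c++)
    {
        if (orig[c] <= 0.f) continue;
        float err = abs_diff(orig[c], decoded[c]) / orig[c];
        if (err > worst) worst = err;
    }
    return worst;
}
```

For example, take the color (1, 0.001, 0): with a shared exponent the 0.001 falls below half a step and decodes as 0. Measured against the max component that is a 0.1% error; measured against the component itself it is 100%.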

Again repeating myself : this is a maximum precision encoding assuming you need to stick to the "RGBE" style of using RGB color space and putting each component in its own byte. That is not the best possible way to send HDR images in 32 bits, and there's no particular reason to use that constraint.

So I don't recommend using this in practice. But I think it's educational because these kinds of considerations should always be studied when designing a conversion. The errors from getting these trivial things wrong are very large compared to the errors that we spend years of research trying to save, so it's quite frustrating when they're done wrong.

3 comments:

Fabian 'ryg' Giesen said...: RGBE is for linear-space RGB data; it's not that useful to look at what the error would be on [0,1] values in sRGB space, because normally you wouldn't do that. You convert the [0,1] values to linear space and RGBE encode that.

I haven't tried to work out what the typical error would be, but it should be better than RGBE encoding of sRGB-space data in [0,1], since the linear-space encoding has a larger dynamic range so we get better use out of the exponent.; June 16, 2020 at 11:27 AM
cbloom said...: All the errors I posted on the primary post (the previous one) are for floats on all ranges. The 0.19%,0.38%,0.78% etc. are measured on random floats, not unit floats.

You can check the test code in the previous post :

Reference test code that will print these errors : test_rgbe_error.cpp

Having an exponent doesn't help with relative error, it's just a necessary requirement to not have your relative error die on large ranges.

I happen to note that the RGBE centered version has exact restoration of the LDR values in [0,255] but I agree that's sort of irrelevant because that's not how it's used in practice.

NOT RELEVANT FOR ACTUAL RADIANCE HDR RGBE USE!

This entire "followup tidbits" post is NOT things that are relevant to Radiance HDR files or how they are used or encoded.

Sorry for yelling but I have so many people that copy paste code even when I put

// do not copy-paste this!

I'm not sure how to make it clearer :(; June 16, 2020 at 1:07 PM

6/09/2020
