Say you want to take the integers {-inf,inf} and map them to just the non-negatives {0,1,..inf}. (and/or vice-versa)
(this is common for example when you want to send a signed value using a variable length code, like unary or golomb or whatever; yes yes there are other ways, for now assume you want to do this).
We need to generate a number line with the negatives "folded up" and interleaved with the positives, like
0,-1,1,-2,2,...The naive way is :
// fold_up makes positives even // and negatives odd unsigned fold_up_negatives(int i) { if ( i >= 0 ) return i+i; else return (unsigned)(-i-i-1); } int unfold_negatives(unsigned i) { if ( i&1 ) return - (int)((i+1)>>1); else return (i>>1); }Now we want to do it branchless.
Let's start with folding up. What we want to achieve mathematically is :
fold_up_i = 2*abs(i) - 1 if i is negativeTo do this we will use some tricks on 2's complement integers.
The first is getting the sign. Assuming 32-bit integers for now, we can use :
minus_one_if_i_is_negative = (i >> 31); = 0 if i >= 0 = -1 if i < 0which works by taking the sign bit and replicating it down. (make sure to use signed right shift, and yes this is probably undefined blah blah gtfo etc).
The other trick is to use the way a negative is made in 2's complement.
(x > 0) -x = (x^-1) + 1 or -x = (x-1)^(-1)and of course x^-1 is the same as (~x), that is flip all the bits. This also gives us :
x^-1 = -x -1
And it leads obviously to a branchless abs :
minus_one_if_i_is_negative = (i >> 31); abs_of_i = (i ^ minus_one_if_i_is_negative) - minus_one_if_i_is_negative; since if i is negative this is -x = (x^-1) + 1 and if i is positive it's x = (x^0) + 0So we can plug this in :
fold_up_i = 2*abs(i) - 1 if i is negative fold_up_i = abs(2i) - 1 if i is negative minus_one_if_i_is_negative = (i >> 31); abs(2i) = (2i ^ minus_one_if_i_is_negative) - minus_one_if_i_is_negative; fold_up_i = abs(2i) + minus_one_if_i_is_negative fold_up_i = (2i) ^ minus_one_if_i_is_negativeor in code :
unsigned fold_up_negatives(int i) { unsigned two_i = ((unsigned)i) << 1; int sign_i = i >> 31; return two_i ^ sign_i; }For unfold we use the same tricks. I'll work it backwards from the answer for variety and brevity. The answer is :
int unfold_negatives(unsigned i) { unsigned half_i = i >> 1; int sign_i = - (int)( i & 1 ); return half_i ^ sign_i; }and let's prove it's right :
if i is even half_i = i>>1; sign_i = 0; return half_i ^ 0 = i/2; // 0,2,4,... -> 0,1,2,... if i is odd half_i = i>>1; // i is odd, this truncates sign_i = -1; return half_i ^ -1 = -half_i -1 = -(i>>1) -1 = -((i+1)>>1) // 1,3,5,... -> -1,-2,-3,...which is what we wanted.
Small note : on x86 you might rather use cdq to get the replicated sign bit of an integer rather than >>31 ; there are probably similar instructions on other architectures. Is there a neat way to make C generate that? I dunno. Not sure it ever matters. In practice you should use an "int32" type or compiler_assert( sizeof(int) == 4 );
Summary :
unsigned fold_up_negatives(int i) { unsigned two_i = ((unsigned)i) << 1; int sign_i = i >> 31; return two_i ^ sign_i; } int unfold_negatives(unsigned i) { unsigned half_i = i >> 1; int sign_i = - (int)( i & 1 ); return half_i ^ sign_i; } |
2 comments:
This is one of these things everyone keeps re-inventing since it's easy to derive and doesn't have an obvious way to Google for it. (One of the few named examples is in Google's protobufs as "zig-zag encoding", but it's either explicit or implicit in a lot of signed VLC constructions).
But my favorite use for this has nothing to do with compression at all: it's the nicest way I know to do a two-sided enumeration expanding outwards from a center point (for iterative deepening-style searches).
Good point about two-sided iteration.
Post a Comment