The standard solution is Fenwick Trees . They are compact (take no more room than the C[i] table itself). They are O(log(N)) fast. In code :

F[i] contains a partial sum of some binary range the size of the binary range is equal to the bottom bit on of i if i&1 - it contains C[i] if i&3 == 2 - contains Sum of 2 ending with i (eg. C[i]+C[i-1] ) if i&7 == 4 - contains Sum of 4 ending with i if i&F == 8 - contains Sum of 8 ending with i(follow the link above to see pictures). To get C[i] from F[i] you basically have to get the partial sums S[i] and S[i-1] and subtract them (you can terminate early when you can tell that the rest of their sum is a shared walk). The logN walk to get S from F is very clever :

sum = 0; while (i > 0) { sum += F[i]; i = i & (i-1); }The i & (i-1) step turns off the bottom bit of i, which is the magic of the Fenwick Tree structure being the same as the structure of binary integers. (apparently this is the same as i -= i & -i , though I haven't worked out how to see that clearly).

If you put F[0] = 0 (F starts indexing at 1), then you can do this branchless if you want :

sum = 0; UNROLL8( sum += F[i]; i = i & (i-1); );(for an 8-bit symbol, eg 256 elements in tree).

You can't beat this. The only sucky thing is that just querying a single probability is also O(logN). There are some cases where you want to query probability more often than you do anything else.

One solution to that is to just store the C[i] array as well. That doesn't hurt your update time much, and give you O(1) query for count, but it also doubles your memory use (2*N ints needed instead of N).

One option is to keep C[i], and throw away the bottom level of the Fenwick tree (the odd indexes that just store C[i]). Now your memory use is (3/2)*N ; it's just as fast but a little ugly.

But I was thinking what if we start over. We have the C[i], what if we just build a tree on it?

The most obvious thing is to build a binary partial sum tree. At level 0 you have the C[i], at level 1 you have the sum of pairs, at level 2 you have the sum of quartets, etc :

showing the index that has the sum of that slot : 0:01234567 1:00112233 2:00001111 3:00000000So update is very simple :

Tree[0][i] ++; Tree[1][i>>1] ++; Tree[2][i>>2] ++; ...But querying a cumprob is a bit messy. You can't just go up the tree and add, because you may already be counted in a parent. So you have to do something like :

sum = 0; if ( i&1 ) sum += Tree[0][i-1]; i>>=1; if ( i&1 ) sum += Tree[1][i-1]; i>>=2; if ( i&1 ) sum += Tree[1][i-1]; ..This is O(logN) but rather uglier than we'd like.

So what if we instead design our tree to be good for query. So we by construction say that our query for cumprob will be this :

sum = Tree[0][i]; sum += Tree[1][i>>1]; sum += Tree[2][i>>2]; ...That is, at each level of the tree, the index (shifted down) contains the amount that should be added on to get the partial sum that preceeds you. That is, if i is >= 64 , then Tree[6][1] will contain the sum from [0..63] and we will add that on.

In particular, at level L, if (i>>L)is odd , it should contain the sum of the previous 2^L items. So how do we do the update for this?

Tree[0][i] ++; i >>= 1; if ( i&1 ) Tree[1][i] ++; i >>= 1; if ( i&1 ) Tree[2][i] ++; ... or Tree[0][i] ++; if ( i&2 ) Tree[1][i>>1] ++; if ( i&4 ) Tree[2][i>>2] ++; ... or Tree[0][i] ++; i >>= 1; Tree[1][i] += i&1; i >>= 1; Tree[2][i] += i&1; ...this is exactly complementary to the query in the last type of tree; we've basically swapped our update and query.

Now if you draw what the sums look like for this tree you get :

These are the table indexes :

0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |

0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | ||||||||

0 | 1 | 2 | 3 | ||||||||||||

0 | 1 | ||||||||||||||

0 |

These are the table contents :

C0 | C0-C1 | C2 | C2-C3 | C4 | C4-C5 | C6 | C6-C7 | C8 | C8-C9 | C10 | C10-C11 | C12 | C12-C13 | C14 | C14-C15 |

0 | C0-C1 | 0 | C4-C5 | 0 | C8-C9 | 0 | C12-C13 | ||||||||

0 | C0-C3 | 0 | C8-C11 | ||||||||||||

0 | C0-C7 | ||||||||||||||

0 |

Now what if we slide everything left so we don't have all those zeros in the front, and we'll go ahead and
stick the total sum in the top :

C0 | C0-C1 | C2 | C2-C3 | C4 | C4-C5 | C6 | C6-C7 | C8 | C8-C9 | C10 | C10-C11 | C12 | C12-C13 | C14 | C14-C15 |

C0-C1 | 0 | C4-C5 | 0 | C8-C9 | 0 | C12-C13 | 0 | ||||||||

C0-C3 | 0 | C8-C11 | 0 | ||||||||||||

C0-C7 | 0 | ||||||||||||||

C0-C15 |

Now, for each range, we stuff the value into the top slot that it covers, starting with the largest range. So (C0-C7) goes in slot 7 , (C0-C3) goes in slot 3, etc :

C0 | -v- | C2 | -v- | C4 | -v- | C6 | -v- | C8 | -v- | C10 | -v- | C12 | -v- | C14 | -v- |

C0-C1 | -v- | C4-C5 | -v- | C8-C9 | -v- | C12-C13 | -v- | ||||||||

C0-C3 | -v- | C8-C11 | -v- | ||||||||||||

C0-C7 | -v- | ||||||||||||||

C0-C15 |

(the -v- is supposed to be a down-arrow meaning you get those cell contents from the next row down). Well, this is exactly a Fenwick Tree.

I'm not sure what the conclusion is yet, but I thought that was like "duh, cool" when I saw it.

## No comments:

Post a Comment