10-21-08 - 4

A few more hashing notes and then I have to move on :

1. Sean pointed out there are cases where the static jumplist thing totally kicks ass. For example, if your data element is a large object and you have them in a flat array anyway, like eg. its 1024 byte file information record. In this case you don't want a normal hash to hold the data by value because it's large and you don't want to waste a bunch of space on the empties. The jump table can index directly into the flat array of elements, so the only memory use is for the index table itself, which is very small. In fact, the number of indexes need is around (N/4) and the size of the index is 4 bytes, so for N elements the index is just N bytes. Note that you can only index the array on one key this way, because it requires reordering the original array for the jump index.

2. Classic style hashes like khash are really bad as "multimaps" in the STL lingo. When you put in a bunch of identical values it makes them locally N^2 because even fancy reprobing can't get identical values to separate. The STLport style of hash handles being "multi" a lot better. Basically just don't try to make classic hashes be "multi" ever. (to some extent any multi-hash is locally N^2 inherently, but the STLport style can insert quickly, give you an equal_range easily, and then once you have it it's ust a linear walk over all the equal key elements, not a bunch of jumping around the table).

3. For the sorted list you could do an interpolation search instead of a binary search. The hashes, which I store, are actually nicely linearly distributed, so you can get a very good estimate of where to jump in. A pure Interpolation search is very slow and kind of silly actually. Once you do your first interpolation to get an initial index, further interpolation doesn't make much sense. You can almost just do a first interpolation and them linear search from there. In fact I just tried that and it works very well :

Binary search on sorted list : 370 clocks

Full interpolation search : 450 clocks

One interpolation then linear search : 300 clocks

Implicit binary tree search : 290 clocks

STLport hash : 150 clocks

jump list : 114 clocks

So obviously there's something of value there, but the one interpolation is sometimes pretty far off and then it linear searches a whole lot. What it could do is do one interpolation, and then do a binary growing search, like +1,+2,+4 to find the range that the value lies in, and then once it finds the range do a binary or linear search to find the value, but I tried that and it was slower. Also note the fast version above is using >> 32 to divide by the hash range and I assume that the hash is using almost all of the 32 bit range. Divides are way too slow to be in the search.

No comments:

old rants