cbloom rants: 10-02-12 - Small note on LZHAM

10/02/2012

10-02-12 - Small note on LZHAM

When I did my comparison of various compressors a little while ago, I also tested LZHAM, but I didn't include it in the charts because the numbers I was seeing from it were very strange. In particular, I saw very very slow decode speeds, which surprised me because it seems to test well in other peoples' benchmarks.

So I finally had a deeper look to sort it out. The short answer is that LZHAM has some sort of very long initialization (even for just the decoder) which makes its speed extremely poor on small buffers. I was seeing speeds like 2 MB/sec , much worse than LZMA (which generally gets 10-25 MB/sec on my machine). (this is just from calling lzham_lib_decompress_memory)

On large buffers, LZHAM is in fact pretty fast (some numbers below). The space-speed is very good (on large buffers); it gets almost LZMA compression with much faster decodes. Unfortunately the breakdown on small buffers makes it not a general solution at the moment IMO (it's also very slow on incompressible and nearly-incompressible data). I imagine it's something like the huffman table construction is very slow, which gets hidden on large files but dominates small ones.

Anyhoo, here are some numbers. Decode shows mb/s.

BTW BEWARE : don't pay too much attention to enwik8 results; compressing huge amounts of text is irrelevant to almost all users. The results on lzt99 are more reflective of typical use.

name	lzt99	decode
raw	24700820	inf
lz4	14814442	1718.72
zlib	13115250	213.99
oodlelzhlw	10164511	287.54
lzham	10066153	61.24
lzma	9344463	29.77

name	enwik8	dec
raw	100000000	inf
lz4	42210253	1032.34
zlib	36445770	186.96
oodlelzhlw	27729121	258.46
lzham	24769055	103.01
lzma	24772996	54.59

(lzma should beat lzham on enwik8 but I can't be bothered to fiddle with all the compress options to find the ones that make it win; this is just setting both to "uber" (and -9) parse level and setting dict size = 2^29 for both)

And some charts for lzt99. See the previous post on how to read the charts .

6 comments:

said...: Can't LZMA2 do multi-threaded encode/decode?; October 2, 2012 at 5:41 PM
said...: ah, so can LZHAM. I'd be interested to see which is better multi-threaded. I expect the same improvement, but the implementations might differ.; October 2, 2012 at 5:45 PM
cbloom said...: Neither one can multi-thread decode, which is what I am timing. Both can multi-thread encode in different ways.; October 2, 2012 at 6:11 PM
Rich Geldreich said...: Thanks for trying out LZHAM. This is a very interesting way of testing codecs. The current decompressor suffers from having to clear too many Huffman context tables at startup. I need to optimize this (maybe add a lazy clearing scheme that only clears those tables that are actually used), and also offer a choice to use less contexts (trading off some compression for faster init).

Also, the latest version of LZHAM (currently only checked into SVN) has faster startup times than the latest downloadable version (I reduced the amount of memory allocation that was ocuring during decompressor init). I need to get off my butt and release it.; October 2, 2012 at 6:12 PM
cbloom said...: BTW an obvious way to multi-thread an LZ77 decoder (without just chunking, which costs compression) would be have one thread doing all the code stream unpacking and another thread read the unpacked code stream and write out the match copies and literals to the output buffer. Should provide some speedup. Not really a great way to use your CPU though.; October 2, 2012 at 6:13 PM
cbloom said...: @Rich - Yeah, LZHAM does worst on really small files (16k or less). Obviously one option for a user is just to drop down to a simpler compressor (eg. zip) on tiny buffers.

Also the Huffman fast decode tables should use fewer peek bits on small files.; October 2, 2012 at 6:20 PM

cbloom rants

10/02/2012

10-02-12 - Small note on LZHAM

6 comments:

old rants