eg. if the nibble is < config_divider_lrl , it's a literal run len; if nibble is >= config_divider_lrl, it's a match.
The point of LZNib is to see how much compression is possible while keeping the decode speed close to the fastest of any reasonable compressor (LZ4,snappy,etc).
Testing different values of config_divider_lrl :
name | raw | config_divider_lrl=4 | config_divider_lrl=5 | config_divider_lrl=6 | config_divider_lrl=7 | config_divider_lrl=8 | config_divider_lrl=9 | config_divider_lrl=10 | config_divider_lrl=11 | config_divider_lrl=12 | best |
lzt00 | 16914 | 5639 | 5638 | 5632 | 5636 | 5654 | 5671 | 5696 | 5728 | 5771 | 5632 |
lzt01 | 200000 | 199360 | 199354 | 199348 | 199345 | 199339 | 199333 | 199324 | 199319 | 199314 | 199314 |
lzt02 | 755121 | 244328 | 243844 | 250836 | 255146 | 255177 | 255257 | 257754 | 260107 | 260597 | 243844 |
lzt03 | 3471552 | 1746220 | 1744630 | 1743728 | 1743043 | 1742718 | 1742728 | 1743191 | 1744496 | 1746915 | 1742718 |
lzt04 | 48649 | 13932 | 13939 | 13968 | 14015 | 14120 | 14184 | 14268 | 14319 | 14507 | 13932 |
lzt05 | 927796 | 422058 | 421115 | 420746 | 420592 | 418289 | 418200 | 418639 | 418854 | 418082 | 418082 |
lzt06 | 563160 | 414925 | 414080 | 412748 | 412223 | 409673 | 408884 | 408361 | 408435 | 407393 | 407393 |
lzt07 | 500000 | 237756 | 237318 | 237004 | 236910 | 236771 | 236949 | 237381 | 238091 | 239176 | 236771 |
lzt08 | 355400 | 309397 | 309490 | 308579 | 307706 | 307263 | 306418 | 305689 | 305495 | 305793 | 305495 |
lzt09 | 786488 | 302834 | 303018 | 303773 | 304350 | 305222 | 307405 | 308888 | 310649 | 314647 | 302834 |
lzt10 | 154624 | 11799 | 11785 | 11792 | 11800 | 11821 | 11843 | 11866 | 11885 | 11923 | 11785 |
lzt11 | 58524 | 22420 | 22341 | 22249 | 22288 | 22276 | 22322 | 22370 | 22561 | 22581 | 22249 |
lzt12 | 164423 | 28901 | 28974 | 28900 | 28957 | 29053 | 29122 | 29296 | 29381 | 29545 | 28900 |
lzt13 | 1041576 | 1072275 | 1068614 | 1058545 | 1047273 | 1025641 | 1025616 | 1025520 | 1025404 | 1024891 | 1024891 |
lzt14 | 102400 | 52010 | 51755 | 51595 | 51462 | 51379 | 51314 | 51298 | 51302 | 51341 | 51298 |
lzt15 | 34664 | 11846 | 11795 | 11767 | 11760 | 11740 | 11740 | 11756 | 11831 | 11837 | 11740 |
lzt16 | 21504 | 11056 | 11000 | 10961 | 10934 | 10911 | 10904 | 10893 | 10883 | 10892 | 10883 |
lzt17 | 53161 | 20122 | 20119 | 20152 | 20210 | 20288 | 20424 | 20601 | 20834 | 21091 | 20119 |
lzt18 | 102400 | 77317 | 77307 | 77274 | 77045 | 77037 | 77020 | 77006 | 76964 | 76976 | 76964 |
lzt19 | 768771 | 306499 | 307120 | 308138 | 309635 | 311801 | 314857 | 318983 | 323981 | 329683 | 306499 |
lzt20 | 1179702 | 975546 | 974447 | 973507 | 972326 | 971521 | 972060 | 971614 | 971569 | 985009 | 971521 |
lzt21 | 679936 | 99059 | 99182 | 99385 | 99673 | 100013 | 100492 | 101018 | 101652 | 102387 | 99059 |
lzt22 | 400000 | 334796 | 334533 | 334357 | 334027 | 333860 | 333733 | 333543 | 333501 | 337864 | 333501 |
lzt23 | 1048576 | 1029556 | 1026539 | 1023978 | 1021833 | 1019900 | 1018124 | 1016552 | 1015139 | 1013815 | 1013815 |
lzt24 | 3471552 | 1711694 | 1710524 | 1708577 | 1706969 | 1696663 | 1695663 | 1694205 | 1692996 | 1688324 | 1688324 |
lzt25 | 1029744 | 224428 | 224423 | 224306 | 229365 | 229362 | 229368 | 229603 | 227083 | 227546 | 224306 |
lzt26 | 262144 | 240106 | 239633 | 239200 | 238864 | 238538 | 238232 | 237960 | 237738 | 237571 | 237571 |
lzt27 | 857241 | 323147 | 323098 | 323274 | 323133 | 322050 | 322068 | 322799 | 322182 | 322573 | 322050 |
lzt28 | 1591760 | 343555 | 345586 | 348549 | 350601 | 352455 | 354077 | 356025 | 360583 | 364438 | 343555 |
lzt29 | 3953035 | 1445657 | 1442589 | 1440996 | 1440794 | 1437132 | 1437593 | 1440565 | 1442614 | 1442914 | 1437132 |
lzt30 | 100000 | 100668 | 100660 | 100656 | 100655 | 100653 | 100651 | 100651 | 100643 | 100643 | 100643 |
total | 24700817 | 12338906 | 12324450 | 12314520 | 12308570 | 12268320 | 12272252 | 12283315 | 12296219 | 12326039 | 12212820 |
The divider at 8 is the same as using a flag bit + 3 bit payload in the nibble.
There does seem to be some win to be had by transmitting divider in the file (but it's not huge). Adaptive divider seems like an interesting thing to explore also, but it will make the decoder slower. Obviously it would be nice to be able to find the optimal divider for each file without just running the compressor N times.
Some comparison to other compressors. I tried to run all compressors with settings for max compression and max decode speed.
I'm having a bit of trouble finding anything to compare LZNib to. Obviously the LZ4+Large Window (LZ4P) that I talked about before is a fair challenger. I thought CRUSH would be the one, but it does very poorly on the big files, indicating that it has a small window (?). If anyone knows of a strong byte-aligned LZ that has large/unbounded window, let me know so I have something to compare against.
(ADDENDUM : tor -c1 can do byte-aligned IO and large windows, but it's a really bad byte-aligned coder)
(obviously things like LZ4 and snappy are not fair challengers because their small window makes them much worse; also note that zlib of course has a small window but is the only one with entropy coding; also several of these are not truly "byte aligned"; they have byte-aligned literals but do matches and control words bit by bit, which is "cheating" in this test (if you're allowed to do variable bit IO you can do much better; hell you may as well do huffman if you're doing variable bit IO) (though I'm also cheating because I believe I'm the only LZ here with a proper optimal parser) (I'm also cheating in a more subtle way, because I have this test set to tweak on and others don't)).
(aside : I don't understand why so many people are still writing small-window LZ's. Yes there is a certain use case for small-window LZ (eg. I did one for use on the SPU), but the most common use case for LZ these days is just "decompress the whole thing into a linear memory buffer", in which case you always have the entire buffer available for matches, so why not use it all?)
raw | minilzo | snappy | quicklz 3 | lz4 hc | lzopack -9 | ULZ c6 | crush cx | zlib | lz4p 332 | lznib div8 | |
lzt00 | 16914 | 7195 | 7254 | 6082 | 6473 | 5805 | 6472 | 5487 | 4896 | 6068 | 5654 |
lzt01 | 200000 | 198906 | 198222 | 200009 | 198900 | 198934 | 199680 | 222477 | 198199 | 198880 | 199339 |
lzt02 | 755121 | 567421 | 552599 | 334625 | 410695 | 426187 | 312889 | 258902 | 386203 | 292427 | 255177 |
lzt03 | 3471552 | 2456002 | 2399663 | 2036985 | 1820761 | 1854718 | 1835797 | 1938138 | 1789728 | 1795951 | 1742718 |
lzt04 | 48649 | 20602 | 21170 | 16894 | 16709 | 14858 | 17359 | 13869 | 11903 | 15584 | 14120 |
lzt05 | 927796 | 590072 | 554964 | 480848 | 460889 | 453608 | 464602 | 444945 | 422484 | 440742 | 418289 |
lzt06 | 563160 | 536084 | 516818 | 563169 | 493055 | 490308 | 428137 | 432989 | 446533 | 419768 | 409673 |
lzt07 | 500000 | 297306 | 298207 | 268255 | 265688 | 242029 | 268271 | 245662 | 229426 | 248500 | 236771 |
lzt08 | 355400 | 356248 | 351850 | 317102 | 331454 | 314918 | 337423 | 315688 | 277666 | 322959 | 307263 |
lzt09 | 786488 | 460896 | 472498 | 372682 | 344792 | 345561 | 329608 | 304048 | 325921 | 325124 | 305222 |
lzt10 | 154624 | 21355 | 25960 | 17249 | 15139 | 14013 | 16238 | 12540 | 12577 | 13299 | 11821 |
lzt11 | 58524 | 28153 | 29121 | 25626 | 25832 | 23717 | 26720 | 23279 | 21637 | 23870 | 22276 |
lzt12 | 164423 | 52725 | 53515 | 38745 | 33666 | 31574 | 36077 | 29601 | 27583 | 30864 | 29053 |
lzt13 | 1041576 | 1045665 | 1041684 | 1041585 | 1042749 | 1041633 | 1048598 | 1061423 | 969636 | 1040033 | 1025641 |
lzt14 | 102400 | 61394 | 63638 | 55124 | 56525 | 52102 | 58509 | 51491 | 48155 | 53395 | 51379 |
lzt15 | 34664 | 16026 | 15417 | 13626 | 14062 | 12663 | 14016 | 12470 | 11464 | 12723 | 11740 |
lzt16 | 21504 | 12858 | 13165 | 11646 | 12349 | 11203 | 12554 | 11119 | 10311 | 11392 | 10911 |
lzt17 | 53161 | 28075 | 29415 | 23478 | 23141 | 20979 | 23829 | 19877 | 18518 | 22028 | 20288 |
lzt18 | 102400 | 100499 | 97931 | 81100 | 85659 | 74268 | 89973 | 76858 | 68392 | 79138 | 77037 |
lzt19 | 768771 | 495312 | 524686 | 411916 | 363217 | 360558 | 333732 | 302006 | 312257 | 335912 | 311801 |
lzt20 | 1179702 | 1181855 | 1161896 | 1037098 | 1045179 | 1042190 | 1013392 | 952329 | 952365 | 993442 | 971521 |
lzt21 | 679936 | 240528 | 240244 | 188446 | 194075 | 174892 | 125322 | 103608 | 148267 | 113461 | 100013 |
lzt22 | 400000 | 401446 | 397959 | 338837 | 361733 | 354449 | 355978 | 321598 | 309569 | 348347 | 333860 |
lzt23 | 1048576 | 1052692 | 1048694 | 1048585 | 1040701 | 1030737 | 1047609 | 985814 | 777633 | 1035197 | 1019900 |
lzt24 | 3471552 | 2668424 | 2613405 | 2425865 | 2369885 | 2521469 | 2040395 | 2080506 | 2289316 | 1934129 | 1696663 |
lzt25 | 1029744 | 361885 | 425735 | 351577 | 324190 | 326180 | 477377 | 297974 | 210363 | 332747 | 229362 |
lzt26 | 262144 | 261152 | 259327 | 244555 | 246465 | 242343 | 252297 | 237162 | 222808 | 244990 | 238538 |
lzt27 | 857241 | 460703 | 466747 | 435522 | 430350 | 395926 | 392139 | 335655 | 333120 | 353497 | 322050 |
lzt28 | 1591760 | 578289 | 617498 | 453170 | 445806 | 421804 | 401166 | 349753 | 335243 | 388712 | 352455 |
lzt29 | 3953035 | 2570903 | 2535625 | 2259281 | 2235299 | 2052597 | 2227835 | 2013763 | 1805289 | 1519904 | 1437132 |
lzt30 | 100000 | 100397 | 100015 | 100009 | 100394 | 100033 | 100782 | 112373 | 100020 | 100393 | 100653 |
total | 24700817 | 17231068 | 17134922 | 15199691 | 14815832 | 14652256 | 14294776 | 13573404 | 13077482 | 13053476 | 12268320 |
BTW I finally got most of these compressors linked into my own test bed for this post. Review of their API's (and some ranting) :
minilzo , quicklz, LZ4 : yes, good, easy. API is basically compress(char *,int len,char *), as it should be.
One minor niggle with these guys : I'm actually leaning towards compress API's taking "void *" now for buffers. It's annoying dealing with API's where somebody thinks "char *" is the way to take an array and someone else wants "const unsigned char *", etc. If it's just memory, take void *, I think. I'm not completely sure about that.
Oh, also "uncompress" ? Where did that come from? It's "decompress". Come on!
Oh, and about LZO : so minilzo is nice and clean, but it only gives you access to the most crappy compressor. To get the good LZO variants you have to use the full LZO sdk which is a nightmare. So I didn't bother and just ran lzopack -9.
miniz : I don't love the single file miniz.c thing; I'd like to have an implementation .c and a header with the externs. It makes it way harder to read the externs and see what is supposed to be public. I also don't like the custom types (mz_uLong and such), I like "int" (I guess "size_t" is okay). But not bad, way better than :
zlib : WTF WTF ridiculous clusterfuck. There's an MSVC make file that seems to make a lib that is not what you want. Then the contrib has some projects for two MSVC versions and not others; then you have to make sure that you get the DLL import and WINAPI defines in your import the same as the built lib. Much hair pulling due to "_deflate not found" link errors.
(BTW I don't entirely blame zlib for that; how many man hours have been wasted due to the linker being so damn unhelpful. Hey jerks who work on the linker, if I'm trying to import "__stdcall deflate" and you only have "__cdecl deflate" then perhaps you could figure out that just maybe I wanted those to hook up. We've only been having this damn problem for the last 20 years. It's the kind of thing where one decent human being on that team would go "hmm I bet we could do something better here". The linker is like the overly precise nerd; if you ask "can you pass the salt" he's like "no, I don't have any salt" , and you're like "fuck you, I can see the salt right next to you" and he's like "nope, no salt", and you're like wtf wtf so you walk around and grab it and you say "okay, what's this?" and he says "that's kosher salt").
Compressed sizes from zlib were slightly smaller than miniz, so I posted them.
lzma : this is a mixed bag; the C++ interface is a total disaster of COM insanity, but you can avoid that and the C interface (LzmaEnc/LzmaDec) is okay. A bit of a mess and you oddly have to handle the props transmission yourself, but so much better than the C++ interface that it looks great in comparison. Very bizarre to expose so many technical encoder settings in the basic API though.
snappy : WTF ? No MSVC build, it uses some crazy STL crap for no reason (iostreams? are you kidding me? using std::string for buffers? WTF), it's all overly complex and bad-C++y. Really gross. There's a mess of files and I can't tell which ones I actually need. Oh, and it generates warnings like crazy; so it's all bloated "clean style" but not actually clean, which is the worst way to be. And then after futzing around with it, I'm rewarded with a really crappy compressor.
I guess I see the point of snappy; it's LZRW1 for modern CPUs. Okay, that's fine. And it is reasonably fast even on incompressible data. So it has a certain use case, perhaps for disk compression, or I guess Google uses it for network data compression or some such. But people have been taking snappy and using it as just a utility data compressor and it is totally terrible for that.
12 comments:
Yeah, Snappy is mostly for RPC. I think most of the pain you're seeing is from the fact that the internal implementation doesn't actually use the C++-y things like iostreams, but whomever open sourced it just jammed it in there. Personally, I don't see why you would use it over LZ4, unless you were already using it everywhere (like we are).
A while back (I might have even asked you), I was going to make a nibble-oriented LZP. Hmm...
Speaking of compressor APIs, did you try lzham? Thoughts on it?
There's also aPLib. Was really good for a pure LZ77 without entropy encoding around 2000 or so, haven't compared it to anything recently.
It's part of the "byte-aligned literals but bitwise control codes" camp though. Good parser (well-tweaked heuristic, not optimal), and IIRC it has unlimited window size (intended for all-in-memory decompression). And an API that doesn't suck (no code for the compressor though, but if you already have an optimal parser it's trivial to write an encoder).
Code is here: http://www.ibsensoftware.com/products_aPLib.html
@Won - yeah as I was doing this I started thinking about LZP-Nib. (I see from my email history search that you did in fact mail me about it long ago).
I suspect that you don't want to do a "pure" LZP (with only one match string), but rather use a few "ways" in the cache table, or perhaps a few bits of forward hash.
Of course this is mostly just rehashing old ideas :
http://cbloomrants.blogspot.com/2011/05/05-20-11-lzp1-variants.html
Also quicklz is a forward-hash LZP (LZP1c) if I'm understanding it correctly.
LZHAM needs this -
int DictLog2(int rawLen)
{
int l2 = cb::intlog2ceil( (float)rawLen );
return cb::Clamp(l2,LZHAM_MIN_DICT_SIZE_LOG2,LZHAM_MAX_DICT_SIZE_LOG2_X86);
}
would you be willing to try gipfeli?
http://code.google.com/p/gipfeli/
gipfeli needs at least a download .zip, and really there should be a Win32 .exe . It also appears to be entropy coded using a simple bit-packing scheme.
I believe that if you're doing bit-packing you may as well just do huffman. But perhaps if your speed/ratio is good you would convince me otherwise.
@ryg - I tested aPLib; compression is very similar to CRUSH
@gipfeli - I'm crankier than usual this morning. But god damn I'm sick of dealing with Google Code. (sourceforge and github and etc etc are no better). The main page should always just be a link to download a zip of the full source tree.
cbloom, if you're looking for a strong LZ77, take a look at LZMAT. It's not entirely about strength, but on some kinds of data it does extremely well, f.e I think it supports arbitrary match length with logarithmic length encoding.
As to aPlib, last time I checked it didn't come with encoder sources...did I miss something?
Re : aPlib - Correct, I just ran the EXE to see if the compression it achieved was interesting. It was not.
Re : LZMAT - yeah I tried it; compression ratios are somewhere around LZ4/CRUSH . It does have slightly different characteristics than the others, but it's definitely small window. Total on my set is 14518303
I know that this is a very old post. I just wanted to comment that aPLib is not a pure LZ77. It has a somewhat strange mechanism for re-using offsets, where instead of including a required offset, it can re-use the offset used the last time when the match was found.
Post a Comment