5/27/2016

PS4 Battle : MiniZ vs Zlib-NG vs ZStd vs Brotli vs Oodle

(see charts at the bottom)

Everything run at max compression options, level 99, max dict size. All libs are the latest on github, downloaded today. Zlib-NG has the arch/x86 stuff enabled. PS4 is AMD Jaguar , x64.

I'm going to omit encode speeds on the per-file results for simplicity, these are pretty representative :


aow3_skin_giants.clb :
zlib-ng encode   : 2.699 seconds, 1.65 b/kc, rate= 2.63 mb/s
miniz encode     : 2.950 seconds, 1.51 b/kc, rate= 2.41 mb/s
zstd encode      : 5.464 seconds, 0.82 b/kc, rate= 1.30 mb/s
brotli-9  encode    : 23.110 seconds, 0.19 b/kc, rate= 307.44 kb/s
brotli-10 encode    : 68.072 seconds, 0.07 b/kc, rate= 104.38 kb/s
brotli-11 encode    : 79.844 seconds, 0.06 b/kc, rate= 88.99 kb/s

Results :

PS4 clang-3.5.0

-------------

lzt99 :

MiniZ : 24,700,820 ->13,120,668 =  4.249 bpb =  1.883 to 1
miniz_decompress_time : 0.292 seconds, 53.15 b/kc, rate= 84.71 mb/s

zlib-ng : 24,700,820 ->13,158,385 =  4.262 bpb =  1.877 to 1
miniz_decompress_time : 0.226 seconds, 68.58 b/kc, rate= 109.30 mb/s

ZStd : 24,700,820 ->10,403,228 =  3.369 bpb =  2.374 to 1
zstd_decompress_time : 0.184 seconds, 84.12 b/kc, rate= 134.07 mb/s

Brotli-9 : 24,700,820 ->10,473,560 =  3.392 bpb =  2.358 to 1
brotli_decompress_time : 0.259 seconds, 59.83 b/kc, rate= 95.36 mb/s

Brotli-10 : 24,700,820 -> 9,949,740 =  3.222 bpb =  2.483 to 1
brotli_decompress_time : 0.319 seconds, 48.54 b/kc, rate= 77.36 mb/s

Brotli-11 : 24,700,820 -> 9,833,023 =  3.185 bpb =  2.512 to 1
brotli_decompress_time : 0.317 seconds, 48.84 b/kc, rate= 77.84 mb/s

Oodle Kraken -zl4 : 24,700,820 ->10,326,584 =  3.345 bpb =  2.392 to 1
encode only      : 4.139 seconds, 3.74 b/kc, rate= 5.97 mb/s
decode only      : 0.073 seconds, 211.30 b/kc, rate= 336.76 mb/s

Oodle Kraken -zl6 : 24,700,820 ->10,011,486 =  3.242 bpb =  2.467 to 1
decode           : 0.074 seconds, 208.83 b/kc, rate= 332.82 mb/s

Oodle Kraken -zl7 : 24,700,820 -> 9,773,112 =  3.165 bpb =  2.527 to 1
decode           : 0.079 seconds, 196.70 b/kc, rate= 313.49 mb/s

Oodle LZNA : lzt99 : 24,700,820 -> 9,068,880 =  2.937 bpb =  2.724 to 1
decode           : 0.643 seconds, 24.12 b/kc, rate= 38.44 mb/s

-------------

normals.bc1 :

miniz :   524,316 ->   291,697 =  4.451 bpb =  1.797 to 1
miniz_decompress_time : 0.008 seconds, 39.86 b/kc, rate= 63.53 mb/s

zlib-ng :   524,316 ->   292,541 =  4.464 bpb =  1.792 to 1
zlib_ng_decompress_time : 0.007 seconds, 47.32 b/kc, rate= 75.41 mb/s

zstd :   524,316 ->   273,642 =  4.175 bpb =  1.916 to 1
zstd_decompress_time : 0.007 seconds, 49.64 b/kc, rate= 79.13 mb/s

brotli-9 :   524,316 ->   289,778 =  4.421 bpb =  1.809 to 1
brotli_decompress_time : 0.010 seconds, 31.70 b/kc, rate= 50.52 mb/s

brotli-10 :   524,316 ->   259,772 =  3.964 bpb =  2.018 to 1
brotli_decompress_time : 0.011 seconds, 28.65 b/kc, rate= 45.66 mb/s

brotli-11 :   524,316 ->   253,625 =  3.870 bpb =  2.067 to 1
brotli_decompress_time : 0.011 seconds, 29.74 b/kc, rate= 47.41 mb/s

Oodle Kraken -zl6 :    524,316 ->   247,217 =  3.772 bpb =  2.121 to 1
decode           : 0.002 seconds, 135.52 b/kc, rate= 215.95 mb/s

Oodle Kraken -zl7 :    524,316 ->   238,844 =  3.644 bpb =  2.195 to 1
decode           : 0.003 seconds, 123.96 b/kc, rate= 197.56 mb/s

Oodle BitKnit :    524,316 ->   225,884 =  3.447 bpb =  2.321 to 1
decode only      : 0.010 seconds, 31.67 b/kc, rate= 50.47 mb/s

-------------

lightmap.bc3 :

miniz :  4,194,332 ->   590,448 =  1.126 bpb =  7.104 to 1 
miniz_decompress_time : 0.025 seconds, 105.14 b/kc, rate= 167.57 mb/s

zlib-ng : 4,194,332 ->   584,107 =  1.114 bpb =  7.181 to 1
zlib_ng_decompress_time : 0.019 seconds, 137.77 b/kc, rate= 219.56 mb/s

zstd :  4,194,332 ->   417,672 =  0.797 bpb = 10.042 to 1 
zstd_decompress_time : 0.014 seconds, 182.53 b/kc, rate= 290.91 mb/s

brotli-9 : 4,194,332 ->   499,120 =  0.952 bpb =  8.403 to 1 
brotli_decompress_time : 0.022 seconds, 118.64 b/kc, rate= 189.09 mb/s

brotli-10 : 4,194,332 ->   409,907 =  0.782 bpb = 10.232 to 1 
brotli_decompress_time : 0.021 seconds, 125.20 b/kc, rate= 199.54 mb/s

brotli-11 : 4,194,332 ->   391,576 =  0.747 bpb = 10.711 to 1 
brotli_decompress_time : 0.021 seconds, 127.12 b/kc, rate= 202.61 mb/s

Oodle Kraken -zl6 :   4,194,332 ->   428,737 =  0.818 bpb =  9.783 to 1 
decode           : 0.009 seconds, 308.45 b/kc, rate= 491.60 mb/s

Oodle BitKnit :   4,194,332 ->   416,208 =  0.794 bpb = 10.077 to 1
decode only      : 0.021 seconds, 122.59 b/kc, rate= 195.39 mb/s

Oodle LZNA :  4,194,332 ->   356,313 =  0.680 bpb = 11.771 to 1 
decode           : 0.033 seconds, 79.51 b/kc, rate= 126.71 mb/s

----------------

aow3_skin_giants.clb

Miniz : 7,105,158 -> 3,231,469 =  3.638 bpb =  2.199 to 1
miniz_decompress_time : 0.070 seconds, 63.80 b/kc, rate= 101.69 mb/s

zlib-ng : 7,105,158 -> 3,220,291 =  3.626 bpb =  2.206 to 1
zlib_ng_decompress_time : 0.056 seconds, 80.14 b/kc, rate= 127.71 mb/s

Zstd : 7,105,158 -> 2,700,034 =  3.040 bpb =  2.632 to 1
zstd_decompress_time : 0.050 seconds, 88.69 b/kc, rate= 141.35 mb/s

brotli-9 :  7,105,158 -> 2,671,237 =  3.008 bpb =  2.660 to 1
brotli_decompress_time : 0.080 seconds, 55.84 b/kc, rate= 89.00 mb/s

brotli-10 : 7,105,158 -> 2,518,315 =  2.835 bpb =  2.821 to 1
brotli_decompress_time : 0.098 seconds, 45.54 b/kc, rate= 72.58 mb/s

brotli-11 : 7,105,158 -> 2,482,511 =  2.795 bpb =  2.862 to 1
brotli_decompress_time : 0.097 seconds, 45.84 b/kc, rate= 73.05 mb/s

Oodle Kraken -zl6 : aow3_skin_giants.clb :  7,105,158 -> 2,638,490 =  2.971 bpb =  2.693 to 1
decode           : 0.023 seconds, 195.25 b/kc, rate= 311.19 mb/s

Oodle BitKnit : 7,105,158 -> 2,623,466 =  2.954 bpb =  2.708 to 1
decode only      : 0.095 seconds, 47.11 b/kc, rate= 75.08 mb/s

Oodle LZNA : aow3_skin_giants.clb :  7,105,158 -> 2,394,871 =  2.696 bpb =  2.967 to 1
decode           : 0.170 seconds, 26.26 b/kc, rate= 41.85 mb/s

--------------------

silesia_mozilla

MiniZ : 51,220,480 ->19,141,389 =  2.990 bpb =  2.676 to 1
miniz_decompress_time : 0.571 seconds, 56.24 b/kc, rate= 89.63 mb/s

zlib-ng : 51,220,480 ->19,242,520 =  3.005 bpb =  2.662 to 1
zlib_ng_decompress_time : 0.457 seconds, 70.31 b/kc, rate= 112.05 mb/s

zstd : malloc failed

brotli-9 : 51,220,480 ->15,829,463 =  2.472 bpb =  3.236 to 1
brotli_decompress_time : 0.516 seconds, 62.27 b/kc, rate= 99.24 mb/s

brotli-10 : 51,220,480 ->14,434,253 =  2.254 bpb =  3.549 to 1
brotli_decompress_time : 0.618 seconds, 52.00 b/kc, rate= 82.88 mb/s

brotli-11 : 51,220,480 ->14,225,511 =  2.222 bpb =  3.601 to 1
brotli_decompress_time : 0.610 seconds, 52.72 b/kc, rate= 84.02 mb/s

Oodle Kraken -zl6 : 51,220,480 ->14,330,298 =  2.238 bpb =  3.574 to 1
decode           : 0.200 seconds, 160.51 b/kc, rate= 255.82 mb/s

Oodle Kraken -zl7 : 51,220,480 ->14,222,802 =  2.221 bpb =  3.601 to 1
decode           : 0.201 seconds, 160.04 b/kc, rate= 255.07 mb/s

Oodle LZNA : silesia_mozilla : 51,220,480 ->13,294,622 =  2.076 bpb =  3.853 to 1
decode           : 1.022 seconds, 31.44 b/kc, rate= 50.11 mb/s

I tossed in tests of BitKnit & LZNA in some cases after I realized that the Brotli decode speeds are more comparable to BitKnit than Kraken, and even LZNA isn't that far off (usually less than a factor of 2). eg. you could do half your files in LZNA and half in Kraken and that would be about the same total time as doing them all in Brotli.


Here are charts of the above data :

(silesia_mozilla omitted due to lack of zstd results)

(I'm trying an experiment and showing inverted scales, which are more proportional to what you care about. I'm showing seconds per gigabyte, and percent out of output size, which are proportional to *time* not speed, and *size* not ratio. So, lower is better.)

log-log speed & ratio :

Time and size are just way better scales. Looking at "speed" and "ratio" can be very misleading, because big differences in speed at the high end (eg. 2000 mb/s vs 2200 mb/s) don't translate into a very big time difference, and *time* is what you care about. On the other hand, small differences in speed at the low end *are* important - (eg. 30 mb/s vs 40 mb/s) because those mean a big difference in time.

I've been doing mostly "speed" and "ratio" because it reads better to the novice (higher is better! I want the one with the biggest bar!), but it's so misleading that I think going to time & size is worth it.

No comments:

old rants