(the compressors are the ones I can compile into my testbed and run from code, eg. not command line apps; they are all run memory to memory; I tried to run all compressors in max-compression-max-decode-speed mode , eg. turning on heavy options on the encode side. Decode times are generated by running each decompressor 10 times locked to each core of my box (80 total runs) and taking the min time; the cache is wiped before each decode. Load times are simulated by dividing the compressed file size by the disk speed parameter. All decoders were run single threaded.)
They are sorted by fastest decompressor. (the "raw" uncompressed file takes zero time to decompress).
"sum" = the sum of decomp + load times. This is the latency if you load the entire compressed file and then decompress in series.
"max" = the max of decomp & load times. This is the latency if the decomp and load were perfectly parallelizable, and neither ever stalled on the other.
The criterion you actually want to use is something between "sum" and "max", so the idea is you look at them both and kind of interpolate in your mind. (obviously you can replace "disk" with ram or network or "channel")
The compressors are ordered from left to right by speed. If you look at the chart of compressed file sizes, they should be going monotonically downward from left to right. Any time it pops up as you go to the right (eg. at snappy, minilzo, miniz, zlib) that is just a bad compressor; it has no hope of being a good choice in terms of space/speed tradeoff. The only ones that are on the "Pareto frontier" are raw, LZ4, OodleLZH, and LZMA.
Basically what you should see is that on a fast disk (100 mbps (and mb = millions of bytes, not 1024*1024)), a very slow decoder like LZMA does not make a lot of sense, you spend way too much time in decomp. On very slow data channels (like perhaps over the net) it starts to make sense, but you have to get to 5 mbps or slower before it becomes a clear win. (of course there may be other reasons that you want your data very small other than minimizing time to load; eg. if you are exceeding DVD capacity).
On a fast disk, the fast decompressors like LZ4 are appealing. (though even at 100 mbps, OodleLZH has a lower "max"; LZ4 has the best "sum").
Of the fast decoders, LZ4 is just the best. (in fact LZ4-largewindow would be even stronger). Zip is pretty poor; the small window is surely hurting it, it doesn't find enough matches which not only hurts compression, it hurts decode speed. Part of the problem is neither miniz nor zlib have super great decoders with all the tricks.
It's kind of ridiculous that we don't have a single decent mainstream free compression library. Even just zip-largewindow would be at least decent. (miniz could easily be extended to large windows; that would make it a much more competitive compressor for people who don't care about zlib compatibility)
If you are fully utilizing your CPU's, you may need a low-CPU decoder even if it's not the best choice in a vacuum. In fact because of that you should avoid CPU-hungry decoders even if they are the best by some simple measure like time to load. eg. even in cases where LZMA does seem like the right choice, if it's close you should avoid it, because you could use that CPU time for something else. You could say that any compressor that can decode faster than it can load compressed data is "free"; that is, the decode time can be totally hidden by parallelizing with the IO and you can saturate the disk loading compressed data. While that is true it assumes no other CPU work is being done, so does not apply to games. (it does sort of apply to archivers or installers, assuming you aren't using all the cores).
As a rough rule of thumb, compressors that are in the "sweet spot" take time that is roughly on par with the disk time to load their compressed data. That is, maybe half the time, or double the time, but not 1/10th the time of the disk (then they are going too fast, compressing too little, leaving too much on the table), and also not 10X the time of the disk (then they are just going way too slow and you'd be better off with less compression and a faster compressor).
The other thing we can do is draw the curves and see who's on the pareto frontier.
Here I make the Y axis the "effective mbps" to load and then decompress (sequentially). Note that "raw" is an x=y line, because the effective speed equals the disk speed.
Let me emphasize that these charts should be evaluated as information that goes into a decision. You do not just go "hey my disk is 80 mbps let me see which compressor is on top at that speed" and go use that. That's very wrong.
and the log-log (log base 2) :
You can see way down there at the bottom of the log-log, where the disk speed is 1.0 mbps, LZMA finally becomes best. Also note that log2 of 10 is a gigabyte per second, almost the speed of memory.
Some intuition about log-log compressor plots :
Over on the right hand side, all the curves will flatten out and become horizontal. This is the region where the decompress time dominates and the disk speed becomes almost irrelevant (load time is very tiny compared to decompress time). You see LZMA flattens out at relatively low disk speed (at 16 mbps (log2 = 4) it's already pretty flat). The speed over at the far right approaches the speed of just the decompressor running memory to memory.
On the left all the curves become straight lines with a slope of 1 (Y = X + B). In this area their total time is dominated by their loading time, which is just a constant times the disk speed. In a log-log plot this constant multiple becomes a constant addition - the Y intercept of each curve is equal to log2(rawLen/compLen) ; eg someone with 2:1 compression will hit the Y axis at log2(2) = 1.0 . You can see them stacked up hitting the Y axis in order of who gets the most compression.
Another plot we can do is the L2 mean of load time and decode time (sqrt of squares). What the L2 mean does is penalize compressors where the load time and decode time are very different (it favors ones where they are nearly equal). That is, it sort of weights the average towards the max. I think this is actually a better way to rate a compressor for most usages, but it's a little hand-wavey so take it with some skepticism.