(LZSSE Latest commit c22a696 ; fetched 03/06/2016 ; test machine Core i7-3770 3.4 GHz ; built MSVC 2012 x64 ; LZSSE2 and 8 optimal parse level 16)
Basically LZSSE is in fact great on text: it decodes faster than LZ4 and compresses much better.
On binary, LZSSE2 is quite bad, but LZSSE8 is roughly on par with LZ4. It looks like LZ4 is maybe slightly better on binary than LZSSE8, but it's close.
In general, LZ4 does well on files that tend to have long LRLs (literal run lengths) and long MLs (match lengths). Files with lots of short (or zero) LRLs and short MLs are bad for LZ4 (eg. text) and not bad for LZSSE.
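To make the LRL/ML point concrete, here is a minimal sketch that counts the bytes one LZ4 sequence occupies. It assumes the standard public LZ4 block format (one token byte holding two 4-bit lengths, 255-style extension bytes, a 2-byte offset, minimum match length 4); LZB16 may differ in details, and this says nothing about LZSSE's own format. The function name `lz4_sequence_bytes` is made up for illustration. It only counts bytes, so it is a rough proxy for size, not a model of decode speed.

```python
def lz4_sequence_bytes(lrl, ml):
    """Bytes used by one LZ4 sequence with `lrl` literals and a match of length `ml`.

    Assumes the standard LZ4 block format; illustrative helper, not from any library.
    """
    if lrl < 0 or ml < 4:
        raise ValueError("LZ4 needs lrl >= 0 and ml >= 4")

    def extension_bytes(n):
        # Each length field is 4 bits. A field value of 15 means extra bytes
        # follow, each adding up to 255, ending with a byte below 255.
        if n < 15:
            return 0
        return (n - 15) // 255 + 1

    token = 1
    offset = 2
    # The match length is stored minus the minimum match length of 4.
    return token + extension_bytes(lrl) + lrl + offset + extension_bytes(ml - 4)


def overhead_fraction(lrl, ml):
    """Non-literal bytes spent per byte of decoded output for one sequence."""
    overhead = lz4_sequence_bytes(lrl, ml) - lrl
    return overhead / (lrl + ml)


if __name__ == "__main__":
    for lrl, ml in [(0, 4), (2, 5), (100, 200)]:
        print(lrl, ml, lz4_sequence_bytes(lrl, ml), round(overhead_fraction(lrl, ml), 3))
```

Under those assumptions, a zero-literal match of length 4 spends 3 bytes to produce 4 bytes of output, while a sequence with 100 literals and a 200-byte match spends 5 bytes of overhead on 300 bytes of output. That per-sequence cost is one plausible reason text-like data, with its many short sequences, is unkind to LZ4.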
(LZB16 is Oodle's LZ4 variant; 64k window like LZSSE; LZNIB and LZBLW have large windows)
Some results :
enwik8 LZSSE2 : 100,000,000 ->38,068,528 : 2866.17 mb/s
enwik8 LZSSE8 : 100,000,000 ->38,721,328 : 2906.29 mb/s
enwik8 LZB16 : 100,000,000 ->43,054,201 : 2115.25 mb/s
(LZSSE kills on text)
lzt99 LZSSE2 : 24,700,820 ->15,793,708 : 1751.36 mb/s
lzt99 LZSSE8 : 24,700,820 ->15,190,395 : 2971.34 mb/s
lzt99 LZB16 : 24,700,820 ->14,754,643 : 3104.96 mb/s
(LZSSE2 really slows down on the heterogeneous binary file lzt99)
(LZSSE8 does okay, but slightly worse than LZ4/LZB16 in size & speed)
mozilla LZSSE2 : 51,220,480 ->22,474,508 : 2424.21 mb/s
mozilla LZSSE8 : 51,220,480 ->22,148,366 : 3008.33 mb/s
mozilla LZB16 : 51,220,480 ->22,337,815 : 2433.78 mb/s
(all about the same size on silesia mozilla)
(LZSSE8 definitely fastest)
lzt24 LZB16 : 3,471,552 -> 2,379,133 : 4435.98 mb/s
lzt24 LZSSE8 : 3,471,552 -> 2,444,527 : 4006.24 mb/s
lzt24 LZSSE2 : 3,471,552 -> 2,742,546 : 1605.62 mb/s
lzt24 LZNIB : 3,471,552 -> 1,673,034 : 1540.25 mb/s
(lzt24 (a Granny file) is really terrible for LZSSE2; it's as slow as LZNIB)
(LZSSE8 fixes it though, almost catches LZB16, but not quite)
------------------
Some more binary files. LZSSE2 is not good on any of these, so omitted.
win81 LZB16 : 104,857,600 ->54,459,677 : 2463.37 mb/s
win81 LZSSE8 : 104,857,600 ->54,911,633 : 3182.21 mb/s
all_dds LZB16 : 79,993,099 ->47,683,003 : 2577.24 mb/s
all_dds LZSSE8 : 79,993,099 ->47,807,041 : 2607.63 mb/s
AOW3_Skin_Giants.clb
LZB16 : 7,105,158 -> 3,498,306 : 3350.06 mb/s
LZSSE8 : 7,105,158 -> 3,612,433 : 3548.39 mb/s
baby_robot_shell.gr2
LZB16 : 58,788,904 ->32,862,033 : 2968.36 mb/s
LZSSE8 : 58,788,904 ->33,201,406 : 2642.94 mb/s
LZSSE8 vs LZB16 is pretty close.
LZSSE8 is maybe more consistently fast; its decode speed has less variation than LZ4 (LZB16 in these tests). Slowest LZSSE8 was all_dds at 2607 mb/s ; LZB16 went down to 2115 mb/s on enwik8. Even excluding text, it was down to 2433 mb/s on mozilla. LZB16/LZ4 had a slightly higher max speed (on lzt24).
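The speed-variation claim can be checked directly from the numbers listed in this post. A small sketch follows; the speeds are copied from the results above, while the dictionary layout and the helper name `speed_range` are just for illustration.

```python
# Decode speeds in mb/s, copied from the results in this post.
SPEEDS = {
    "LZSSE8": {
        "enwik8": 2906.29, "lzt99": 2971.34, "mozilla": 3008.33,
        "lzt24": 4006.24, "win81": 3182.21, "all_dds": 2607.63,
        "AOW3_Skin_Giants.clb": 3548.39, "baby_robot_shell.gr2": 2642.94,
    },
    "LZB16": {
        "enwik8": 2115.25, "lzt99": 3104.96, "mozilla": 2433.78,
        "lzt24": 4435.98, "win81": 2463.37, "all_dds": 2577.24,
        "AOW3_Skin_Giants.clb": 3350.06, "baby_robot_shell.gr2": 2968.36,
    },
}


def speed_range(per_file):
    """Return (slowest file, its speed, fastest file, its speed)."""
    slowest = min(per_file, key=per_file.get)
    fastest = max(per_file, key=per_file.get)
    return slowest, per_file[slowest], fastest, per_file[fastest]


if __name__ == "__main__":
    for codec, per_file in SPEEDS.items():
        slow_name, slow, fast_name, fast = speed_range(per_file)
        print(f"{codec}: min {slow:.2f} ({slow_name}), "
              f"max {fast:.2f} ({fast_name}), max/min {fast / slow:.2f}")
```

On these eight files the max/min ratio comes out to roughly 1.5 for LZSSE8 and roughly 2.1 for LZB16. It's a small sample, so treat that as a hint of consistency rather than a measurement of it.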
Conclusion :
On binary-like data, LZ4 and LZSSE8 are pretty close. On text-like data, LZSSE8 is definitely better. So for general data, it looks like LZSSE8 is a definite win.