cbloom rants

1/09/2012

01-09-12 - LZ Optimal Parse with A Star Part 5

Wrapping up the series with lots of numbers.

Previous parts :

cbloom rants 12-17-11 - LZ Optimal Parse with A Star Part 4
cbloom rants 12-17-11 - LZ Optimal Parse with A Star Part 3
cbloom rants 12-17-11 - LZ Optimal Parse with A Star Part 2
cbloom rants 10-24-11 - LZ Optimal Parse with A Star Part 1

I'm delirious with fever right now so I might write something inane, but I'm so bored of lying in bed so I'm trying to wrap this up. Anyhoo..

So first of all we have to talk a bit about what we're comparing the A Star parse to.

"Normal" is a complex forward lazy parse using heuristics to guide parsing, as described in Part 1. "Fast" is like Normal but uses simpler heuristics and simpler match finder.

"Chain" is more interesting. Chain is a complex "lazy"-type parser which considers N decisions ahead (eg. Chain 4 considers 4 decisions ahead). It works thusly :

Chain Parse : first do a full parse of the file using some other parser; this provides with a baseline cost to end from each point. Now do a forward parse. At each position, consider all match and literal options. For each option, step ahead by that option and consider all the options at the next position. Add up the cost of each coding step. After N steps (for chain N) add on the cost to end from the first baseline parse. Go back to the original position and finalize the choice with the lowest cost. Basically it's a full graph walk for N steps, then use an estimate of the cost to the end from the final nodes of that sub-graph.

To make Chain parsing viable you have to reduce the number of match options to a maximum of 8 or so. Still Chain N has a complexity of 8^N , so it becomes slow very quickly as N grows.

Chain forward parse is significantly better than LZSS style backwards optimal parse for these LZ coders that have important adaptive state. The baseline parse I use for Chain actually is a backwards LZSS optimal parse, so you can see how it does by looking at the "Chain 0" results.


First overall results. Chain 6 is the most amount of steps I can run in reasonable time, and AStar 2048 means the quantum length for dividing up the file for AStar was 2048.

raw Fast Normal Chain 6 AStar 2048
lzt00 16914 5179 5016 4923 4920
lzt01 200000 198313 198321 198312 198312
lzt02 755121 181109 177792 173220 173315
lzt03 3471552 1746443 1713023 1698949 1690655
lzt04 48649 13088 12412 10407 10249
lzt05 927796 368346 367598 355804 354230
lzt06 563160 352827 351051 344721 343173
lzt07 500000 226533 215996 209133 208566
lzt08 355400 250503 249987 230541 230220
lzt09 786488 302927 287479 268544 265525
lzt10 154624 11508 10958 10307 10291
lzt11 58524 20553 19628 19139 19087
lzt12 164423 29001 26488 23966 23622
lzt13 1041576 935484 931415 924510 922745
lzt14 102400 47690 47298 46417 46350
lzt15 34664 10832 10688 10269 10260
lzt16 21504 10110 10055 9952 9927
lzt17 53161 19526 18514 17971 17970
lzt18 102400 64280 63251 59772 59635
lzt19 768771 322951 288872 269132 269162
lzt20 1179702 888881 872315 856369 855588
lzt21 679936 91677 88011 83529 83184
lzt22 400000 287715 284378 279674 279459
lzt23 1048576 807253 804048 798369 798334
lzt24 3471552 1418076 1411387 1399197 1388105
lzt25 1029744 113085 107882 97320 100175
lzt26 262144 212445 210836 207701 207552
lzt27 857241 237253 235137 222023 220837
lzt28 1591760 332660 308940 260547 252808
lzt29 3953035 1193914 1180823 1147160 1135603
lzt30 100000 100001 100001 100001 100001
10800163 10609600 10337879 10289860


Now number of Chain steps for the chain parser : (that's O0 - O6)

U N O0 O1 O2 O3 O4 O5 O6
lzt00 16914 5016 5024 4922 4922 4922 4922 4923 4923
lzt01 200000 198321 198321 198312 198312 198312 198312 198312 198312
lzt02 755121 177792 177877 175905 174835 174073 173759 173509 173220
lzt03 3471552 1713023 1712337 1704417 1703873 1702651 1701635 1700282 1698949
lzt04 48649 12412 11315 10516 10481 10457 10427 10416 10407
lzt05 927796 367598 368729 365743 364332 360630 356403 355968 355804
lzt06 563160 351051 350995 346856 345500 344778 344739 344702 344721
lzt07 500000 215996 215644 211336 209481 209259 209244 209138 209133
lzt08 355400 249987 249372 239375 237320 231554 231435 233324 230541
lzt09 786488 287479 284875 280683 275679 270721 269754 269107 268544
lzt10 154624 10958 10792 10367 10335 10330 10311 10301 10307
lzt11 58524 19628 19604 19247 19175 19225 19162 19159 19139
lzt12 164423 26488 25644 24217 24177 24094 24108 24011 23966
lzt13 1041576 931415 931415 929713 927841 926162 924515 924513 924510
lzt14 102400 47298 47300 46518 46483 46461 46437 46429 46417
lzt15 34664 10688 10656 10317 10301 10275 10278 10267 10269
lzt16 21504 10055 10053 9960 9966 9959 9952 9948 9952
lzt17 53161 18514 18549 17971 17970 17974 17971 17973 17971
lzt18 102400 63251 63248 59863 59850 59799 59790 59764 59772
lzt19 768771 288872 281959 277661 273316 269157 269141 269133 269132
lzt20 1179702 872315 872022 868088 865376 863236 859727 856408 856369
lzt21 679936 88011 88068 84848 83851 83733 83674 83599 83529
lzt22 400000 284378 284297 281902 279711 279685 279689 279696 279674
lzt23 1048576 804048 804064 802742 801324 799891 798367 798368 798369
lzt24 3471552 1411387 1410226 1404736 1403314 1402345 1401064 1400193 1399197
lzt25 1029744 107882 107414 99839 100154 99710 98552 98132 97320
lzt26 262144 210836 210855 207775 207763 207738 207725 207706 207701
lzt27 857241 235137 236568 233524 228073 223123 222884 222540 222023
lzt28 1591760 308940 295072 286018 276905 273520 269611 264726 260547
lzt29 3953035 1180823 1183407 1180733 1177854 1170944 1162310 1152482 1147160
lzt30 100000 100001 100001 100001 100001 100001 100001 100001 100001
10609600 10585703 10494105 10448475 10404719 10375899 10355030 10337879

Some notes : up to 6 (the most I can run) more chain steps is better - for the sum, but not for all files. In some cases, more steps is worse, which should never really happen, but it's an issue of approximate optimal parsers I'll discuss later. (*)

On most files, going past 4 chain steps helps very little, but on some files it seems to monotonically keep improving. For example lzt29 stands out. Those files are ones that get helped the most by AStar.


Now the effect on quantum size on AStar. In all cases I only output codes from the first 3/4 of each quantum.

raw 256 512 1024 2048 4096 8192 16384
lzt00 16914 4923 4923 4920 4920 4920 4921 4921
lzt01 200000 198312 198312 198312 198312 198312 198314 198314
lzt02 755121 175242 173355 173368 173315 173331 173454 173479
lzt03 3471552 1699795 1691530 1690878 1690655 1690594 1690603 1690617
lzt04 48649 10243 10245 10234 10249 10248 10241 10241
lzt05 927796 357166 354629 354235 354230 354233 354242 354257
lzt06 563160 346663 343202 343139 343173 343194 343263 343238
lzt07 500000 209934 208669 208584 208566 208556 208553 208562
lzt08 355400 228389 229447 229975 230220 230300 230374 230408
lzt09 786488 266571 265564 265487 265525 265559 265542 265527
lzt10 154624 10701 10468 10330 10291 10273 10273 10272
lzt11 58524 19139 19123 19096 19087 19085 19084 19084
lzt12 164423 23712 23654 23616 23622 23628 23630 23627
lzt13 1041576 923258 922853 922747 922745 922753 922751 922753
lzt14 102400 46397 46364 46351 46350 46350 46348 46350
lzt15 34664 10376 10272 10260 10260 10251 10258 10254
lzt16 21504 9944 9931 9926 9927 9927 9927 9927
lzt17 53161 17937 17970 17968 17970 17969 17969 17969
lzt18 102400 59703 59613 59632 59635 59637 59640 59640
lzt19 768771 269213 269151 269128 269162 269193 269218 269229
lzt20 1179702 855992 855580 855478 855588 855671 855685 855707
lzt21 679936 83882 83291 83215 83184 83172 83171 83169
lzt22 400000 279803 279368 279414 279459 279605 279630 279647
lzt23 1048576 798325 798319 798321 798334 798354 798357 798358
lzt24 3471552 1393742 1388636 1388031 1388105 1388317 1388628 1388671
lzt25 1029744 97910 101246 101302 100175 100484 100272 100149
lzt26 262144 207779 207563 207541 207552 207559 207577 207576
lzt27 857241 222229 220832 220770 220837 220773 220756 220757
lzt28 1591760 256404 253257 252933 252808 252737 252735 252699
lzt29 3953035 1136193 1135442 1135543 1135603 1135710 1135689 1135713
lzt30 100000 100001 100001 100001 100001 100001 100001 100001
10319878 10292810 10290735 10289860 10290696 10291106 10291116

The best sum is at 2048, but 1024 is a lot faster and almost the same.

Again, as the previous note at (*), we should really see just improvement with larger quantum sizes, but past 2048 we start seeing it go backwards in some cases.


Lastly a look at where the AStar parse is spending its time. This is for a 1024 quantum.

The x axis here is the log2 of the number of nodes visited to parse a quantum. So, log2=20 means a million nodes were needed to parse that quantum. So for speed purposes a cell one to the right is twice as bad. The values in the cells are the percentage of quanta in the file that needed that number of nodes.

(note : log2=20 means one million nodes were visited to output 768 bytes worth of codes, so it's quite a lot)

log2 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
lzt00 0 0 0 18.18 59.09 18.18 4.55
lzt01 3.75 0.75 41.2 34.08 13.86 5.62 0.75
lzt02 1.81 1.36 25.37 34.09 13.59 13.02 8.15 1.93 0.23 0.23
lzt03 1.46 1.18 17.51 18.46 14.16 13.17 6.95 4.81 3.66 4.81 9.54 2.79 0.96 0.11 0.03
lzt04 1.67 0 0 1.67 0 21.67 5 18.33 3.33 10 16.67 16.67 5
lzt05 0.59 0.25 4.41 10.77 9.92 18.32 13.23 10.09 9.67 6.02 12.47 3.22 0.51 0.08 0.08
lzt06 0.8 0.93 6.81 23.77 14.69 16.96 21.09 11.48 2.67 0.8
lzt07 0.46 0.46 8.66 7.88 6.8 15.3 17 14.53 5.56 9.58 8.19 4.79 0.31 0.31
lzt08 0 0 0 0 0 0 1.68 1.68 1.47 27.67 53.88 11.95 1.68
lzt09 0.29 0.48 0.76 0.86 0.95 3.9 28.07 47.76 16.18 0.38
lzt10 0 0.56 10.17 12.99 9.04 9.04 10.17 41.24 4.52 0.56 1.13
lzt11 0 0 7.89 10.53 14.47 17.11 6.58 9.21 21.05 10.53 2.63
lzt12 0 0 0 0 0 4.27 28.91 59.24 7.58
lzt13 0 0 0.07 0.14 0.57 1.72 3.36 5.72 39.24 42.03 7.08 0.07
lzt14 0 0.83 0 2.5 8.33 34.17 42.5 5 2.5 1.67 0.83 0 0.83
lzt15 0 2.27 4.55 15.91 13.64 15.91 13.64 6.82 11.36 11.36 4.55
lzt16 0 0 3.57 0 14.29 42.86 32.14 3.57
lzt17 1.39 1.39 2.78 1.39 4.17 75 13.89
lzt18 0 0 0 0 0 0.72 0 2.17 2.9 11.59 56.52 23.19 2.9
lzt19 0 0 1.26 2.81 0.39 7.56 87.11 0.87
lzt20 0 0.13 2.08 2.02 4.29 67.07 24.29 0.06
lzt21 0.2 0.78 6.07 6.07 5.28 19.77 35.62 22.9 1.96 0.2 0.2
lzt22 0 0.56 2.98 5.59 26.82 62.94 1.12
lzt23 0 0 0 0 0 0.07 1.35 2.63 0.92 70.88 23.15 0.14 0.36 0.5
lzt24 0.44 0.61 4.14 37.41 7.62 12.68 12.72 8.52 6.11 5.19 3.11 0.94 0.31 0.04
lzt25 0.22 0.43 1.52 1.74 2.68 6.44 15.69 27.19 30.22 13.09 0.72
lzt26 0 0 0 1.15 3.15 2.58 77.65 14.61 0.57
lzt27 0.61 0.1 7.55 6.53 1.22 4.39 5 4.08 7.76 44.8 16.43 1.43
lzt28 0.25 0.1 3.71 0.94 0.74 6.77 15.56 10.08 10.97 14.82 18.68 11.41 4.05 1.24 0.1
lzt29 0.3 0.73 1.61 22.37 5.28 6.16 26.34 2.97 0.48 0.85 19.63 12.47 0.73
lzt30 3.7 0.74 47.41 34.07 12.59 0.74

Well there's no easy answer, the character of the files are all very different.

In many cases the A Star parse is reasonably fast (comparable to Chain 3 or something). But in some cases it's quite slow, eg. lzt04, lzt08, lzt28.


Okay, I think that's all the data. We have one point to discuss :

(*) = in all these type of endeavors, we see these anomolies where as we give the optimizer more space to make decisions, it gets better for a while, then starts getting worse. I saw the same thing, but more extreme, with video coding.

Basically what causes this is that you aren't optimizing for your real final goal. If you were optimizing for the total output size, then giving it more freedom should never hurt. But you aren't. With Chain N or with A Star in both cases you are optimizing just some local portion, and it turns out that if you let it make really aggressive decisions trying to optimize the local bit, that can hurt overall.

A similar issue happens with an Huffman optimal parse, becuase you are using the huffman code lengths from the previous parse to do the current parse. That's fine as long as your parse is reasonably similar, but if you let the optimal parser really go nuts, it can start to get pretty far off those statistics, which makes it wrong, so that more optimizing actually gives worse results.

With video coding the main issue I had was that the optimization was generally local (eg. just on one macro block at a time or some such), but it of course affects the future as a source for motion compensation (and in other ways), and it turns out if you do really aggressive optimization on the local decisions, that can wind up hurting overall.

A similar thing can happen in image and video coding if you let optimization proceed very aggressively, because you have to use some simple analytic criterion (such as RMSE - though even if you use a fancier metric the same problems arise). The issue is that the coder can wind up finding strange states that are a good trade-off for RMSE, but wind up looking just horrible visually.

Obviously the correct solution is to optimize with the true final goal in mind. But that's not always possible, either computationally, or because the final goal is subjective.

Generally the solution is to moderate the optimization in some way. You have some heuristic idea of what kind of solutions will provide good globally optimal solutions. (for example, in image/video coding, you might require that the bit rate allocation not create too big of a difference between adjacent blocks). So you sort of want to guide your optimization to start around where you suspect the answer to be, and then you tune it so that you don't allow it to be too aggressive in making whatever decision it thinks is locally optimal.

1/07/2012

01-07-12 - Protectionism

There are some basic economics that I just don't understand, and a lot of the times the accepted "right answer" conflicts with common sense.

One example is it seems to me that buying something locally made is better for the local area than buying something made far away. (for "locality" you may substitute "state" or "nation" or whatever region you want to divide things by).

For me this conjures bad memories of the anti-"Jap" "USA USA" crowd of the '80's that had "buy American" bumper stickers and such ; something in my moral fibers says you should buy the best quality cheapest product. But I don't think that's true.

Consider for the moment the case that there are two products, identical in price and quality. One is locally made, one is foreign made. I contend it is better for the local area (and usually better for you personally) to buy the locally made product. When you do that, the money goes to someone who lives nearby, who spends that money again, and that person spends it, again, etc. This makes the local area prosperous.

(in a purely selfish sense, whether or not making the local area prosperous is good for you or not depends on the details of your situation; if you are a merchant or an altruist, it is good for you; but if your business is international and you would prefer local property values to be low, it might be bad for you; we will assume for the moment that you want the local area to benefit).

So there is some value to buying local and keeping money and industry circulating locally.

So, even if the local product is somewhat more expensive, it still might be better overall if you bought that instead of the foreign product. You have to weigh the benefit of both; the region gains some utility from access to cheaper foreign products, but that is traded off against not circulating that money around the local economy. eg. there's some break even point (in terms of overall utility) ; maybe if the local product costs 20% more, that's the actual break even point.

Of course consumers should not have to make that decision themselves, they should just be able to buy the cheapest product. The correct way to fix that is with government - one of the valuable things that government can do is to make apparent price equal to actual price (eg. to make price proportional to utility, or to move long term costs forward, etc.), or to use laws to bias pricing so that logical purchasing decisions lead to the greatest overall utility (eg. putting penalties on products that help you but hurt others).

The obvious way to make the prices match utility here is either with tarrifs on imports, or subsidies for local production. This is called "protectionism" to attack it, but it seems to me it's just a way of getting the benefit of circulating those dollars locally.

I'm a little disturbed by my conclusion because it's awfully close to the anti-globalization crackpots who claim that modern government financial policy benefits "wall street not main street" (and other slogans).


Granted, in reality, it's too late to go back to pre-1990's protectionism. The cat is out of the bag. And of course in reality protectionism degrades into political gifts for corrupt corporations. But we can ignore those issues for the theoretical discussion.

Also, if you are an extremely altruistic chap you might question the whole goal of maximizing the benefit to your locality (nation/state/fiefdom/whatever). You might say the goal of policies should be to maximize the good for the world. But for the moment let's ignore that and assume that the government of a nation should act to maximize benefit for that nation.

1/06/2012

01-06-12 - Surveying is a Powder Keg

So I got my property surveyed a while ago, because there were some boundaries I wasn't totally sure about, and wanted to see how much space I had for fences, etc.

I should have realized this but did not - surveying is a powder keg. When the surveyor comes out and puts out his stakes and flags, it's like a siren call for neighborhood crazies to come around and dispute the line.

Basically people are retarded and unreasonable, and before just calmly talking to you, they assume you will ask them to take down their fence which is across the line (or whatever). Depending on where you live, the exact force of crazy takes different forms; out in the country you might get shot at; over in the suburbs you might get served with a notice of adverse posession. All just because you hired a surveyor, before you even consider doing anything about it. The crazies seem to see the surveyor's flag as a declaration of war.

The mistake I made was I got the survey and then I wanted some time to think about how I was going to fence things, I didn't just jam up the fences right away. That gave the crazies time, and this is what they did :

(pink flag stake is official from the surveyor of course, and crazy neighbor has put up his own line two feet into my property from the official mark)

Well done, crazy neighbor.

A bit of forehead vein popping and yelling at them got them to remove the post, but I expect more complications of this issue before it is done.

(BTW yelling at people is never satisfying in real life the way it is in fiction. In fiction there's this myth that people are actually good and reasonable, and were just doing something bad for a moment, and when you yell at them they realize their mistake and reform; like if Gorden Ramsay walked in at the right moment and yelled at Hitler he would be like "oh gosh, you're right, I'm so ashamed, I'll try to do better". Furthermore in fiction, they respond to the yelling either by yelling back, giving a satisfying argument, or by accepting the scolding and apologizing. In real life that never happens, what really happens is they try to change the subject, or turn it around and somehow blame you, or make excuses, or bring up random other points that don't matter to the issue (*), and it just leaves you feeling derailed).

(* = this is maybe the most common and most effective response - the completely random diversion into some other story that just leaves me going "WTF?" and totally takes the wind out of my sails. The other super effective tactic that I've noticed from like car salesman and contractors and such is to just completely stone wall you, like they tack on some $500 extra fee that they specifically agreed they wouldn't do, and you're like "hey WTF is this fee that you said wouldn't be there" and they just act like they did nothing wrong and of course you're going to go along with them, like "yep that's the $500 extra rapeage surcharge, so you can pay now by giving me your bank account and social security number..." , wait, what? I'm in the middle of yelling at you, you can't just act like your way is the only way).

01-06-12 - Nice Wiring Bub

I took off a horrible track light (blyeck, so tacky, track lights and can lights are the worst), and underneath I found this gem :

at first I thought it was just a big wad of masking tape on the end of the wire (bad enough, not an actual insulator, and a fire hazard), but upon peeling the masking tape I found this :

Oh, of course. You spliced on an extra 1 inch of wire, no wire nut, wrapped in masking tape - and the thing that most boggles my mind is that the wire ends are not even twisted in any kind of sane way, they are just randomly balled up around each other. Not to mention the original wires are plenty long without the splice.

Pretty impressive piece of fail.

BTW one of the hazards of old knob & tube wires is that the insulation is only rated to 60 C, but newer light fixtures are allowed to heat up to 90 C (which new Romex can handle). So you need to be careful when installing new light fixtures, and at the very least don't over-bulb (*). One way to solve this (without a ton of rewiring) is to back up the knob and tube a few feet away, put a junction box there, then run new Romex for the last few feet.

(* = I just love to over-bulb ; I can never get enough light; I used to put 100 W's in everything whenever I moved into an apartment. Here I might be a bit more careful about that, because they do generate a lot more heat (BTW I despise the gross inhumane light of CFL's, but one advantage of them with old wiring is they draw much less power (which keeps the wires cool) and they themselves are cool (which doesn't heat up the light fixture and box)). I'm not a huge fan of dimmers (fucking 75 W is already dark enough, I don't need any less than that), but if I could install an *amplifier* that let me over-drive the bulbs that would be sweet (but not in my house, which is apparently wired by paper clips and masking tape)).

It's kind of scary what kinds of disasters can be hiding inside your walls that you don't know about upon purchase (or maybe ever, until they cause water damage or a fire or whatever). I really like doing home improvement work in the garage and the basement, because the walls are unfinished so I can see where the studs are, which is so handy, I can see all the wire runs and junction boxes. It's totally superior.

Covering up your walls is super over-rated. I think if I was designing a modern house it would be all Pompidou Center style with color-coded pipes running around where I could directly access the electricity, water, etc.

If you want a more old fashioned home look, you could still do all your major wire runs around the ceiling and then cover them with a removeable wood crown molding piece. That way if you want to get into the wires, you just pop off the crown molding and you have a wooden box for access.

1/04/2012

01-04-12 - Two laws you should hate

NDAA : makes legal the GWB/Obama policy of indefinite detainment (outside of war zone, Geneva convention, or any legal jurisdiction), and unilateral assasination orders -

� Obama�s Signing Statement on NDAA I have the power to detain Americans� but I won�t Alex Jones' Infowars There's a war on
The NDAA's historic assault on American liberty Jonathan Turley Comment is free guardian.co.uk
The Hit List The Public Applauds As President Obama Kills Two Citizens As A Presidential Prerogative � JONATHAN TURLEY
Senate Votes Overwhelmingly To Allow Indefinite Detention of Citizens � JONATHAN TURLEY

SOPA : basically makes free speech on the internet impossible, by making site hosts legally liable for any content posted on them. Basically allowed private companies to censor the internet at will.

Stop Online Piracy Act - Wikipedia, the free encyclopedia
SOPA for Dummies - Google Docs
House takes Senate's bad Internet censorship bill, tries making it worse

SOPA might not pass in it's worst form, but some lobbyists are pushing very hard for something like this, so the internet is going to get censored unless we fight it very hard.

01-04-12 - Police Brutality

Much has been made of the outrageous treatment of Occupy protestors by police. But I believe it's been small potatos compared to the rampant, systemic brutality which pervades our nation's police departments. It's fallen out of the news because we're bored of it (we've seen black guys getting beaten by cops a million times) and because it doesn't affect the wealthy, but it has really not gotten better (or not enough, anyway).

Police in American are de-facto above the law. They violate human rights at will, with rarely a punishment greater than suspension or transfer.

Here in Seattle things have gotten shockingly bad, so bad in fact that even the DOJ has made an official report about how bad our police department is. (original here) . What is Seattle's official response? Not to do anything about it, it's to question the methodology of the report. Shameful.

To Protect & Skull-Fuck - Page 1 - News - Seattle - Seattle Weekly
SPDecay - Page 1 - News - Seattle - Seattle Weekly
Seattle sues attorney over public records request Local & Regional Seattle News, Weather, Sports, Breaking News KOMO News
Seattle Police Department Sued by KOMO News for Not Releasing Dash-Cam Videos - Seattle News - The Daily Weekly
Seattle Police A Department in Denial - Page 1 - News - Seattle - Seattle Weekly

One of the few ways that people are getting any justice these days is by getting the dash cam footage to prove that the cops' lies are in fact lies. (see for example Ian Birk lying about John T. Williams "lunging" at him (eerily similar to the very tragic case of Otto Zehm in Spokane in which officers also lied and claimed he "lunged" (with a soda pop bottle, which led them to kill him))). The result of course is that SPD is doing what it can to stop the dash cam system. They are now suing to stop releases of footage under public records disclosures, and have "accidentally" deleted many thousands of hours of footage.

Police Chief Diaz needs to be fired.

But in the larger picture, the big problem is the stone wall and loyalty attitude of police departments; that when there is a case where a police office may have killed a civilian without cause, their attitude is not to investigate and apologize, it's to cover it, draw ranks, support the officer, etc. This attitude makes not just the few bad cops responsible, but every cop who treats his compatriots as beyond reproach or above the law. Loyalty to evil is not admirable. (ask Joe Paterno).

Following a rash of unjustified killings in the 80's, many laws were passed that make it somewhat more difficult for police officers to use their guns. But the gap has been filled by stompings, clubbings, and taserings.

spokane police abuses summary

previous post at cbloomrants

Of course part of the problem is that there is a decent portion of the population that thinks "tough policing" is a good thing.

01-04-12 - Double Pane Glass is a scam

"Replacement Windows" are shit sold by the window industry to sucker homeowners. The tiny gap (typically less than 1 inch) in a standard double pane IGU (integrate glass unit) is no better (and sometimes worse) than a traditional window + storm.

Throwing out perfectly good lovely old windows for "environmental" reasons is of course retarded; if you want more air proofing and don't already have storm windows, just get some good storms and you're done.

Replacement windows almost always uglier than good old wood windows, which architecturally fit the house and have nice wavy old glass.

Furthermore, they cannot be maintained and repaired in the same way as an old window + storm. When an IGU fails (which they do in 10-20 years typically) it cannot be repaired, it has to be replaced. Old wood windows can be easily taken a part, cleaned, resealed, and can last 100 years. (Vinyl windows and caulk and foam seal strips and so on are similarly problematic - they seem great at first, but they all decay badly in sun and weather, so have to be replaced regularly and can't really be maintained).

Replacement windows are usually shoved inside the existing window framing, and add their own thin frame which makes the opening smaller and adds an extra ugly architectural detail.

It's a standard "sustainable" bullshit corporate ripoff. To sell you some new crap and get you to throw away your perfectly good old windows. (it's like the wonderful irony of "sustainable christmas trees").

This guy addresses the issue in much more detail.

12/20/2011

12-20-11 - Grocery Store Lines

Grocery store lines are a microcosm for how fucking shitty almost every human is, in almost every way.

First of all, you have the fact that nobody shows any basic human courtesy to each other. I almost never see someone with a ton of items let ahead the person with one item. But of course when I let the person with one item go ahead of me, they invariably do something fucked up like ask for a pack of cigarettes (which always takes forever in US grocery stores) or pay with food stamps or some shit. (aside : why is it always such a clusterfuck to pay with food stamps in some groceries? they must have done it a million times, but the checker always acts like someone handed them monopoly money, and the manager has to get called, and forms are filled out, wtf). Of course the people who are paying with coupons and civil war scrip never warn the person lining up behind them that maybe they should pick a different line.

But when the lines get long you really start to see people's souls.

There's the people who stand around and chat right in the middle of the lines. I watch people over and over asking "are you in line? oh, no? okay". Hmm, maybe you should get the fuck out of the line area to have your chit chat!

Then there's the people who can't seem to run a line in a reasonable direction and wind up blocking all the aisles or running it into another line. Invariably it takes a manager to come over and tell people to "please line up over here" since god knows they aren't going to sort themselves out.

Then you get the people who start stamping around and huffing and quickly looking from one side to another like this is the greatest injustice since slavery. You can just see wheels spinning in their heads about how "ridiculous this is" and so on.

There are the people who think that being really pushy in the back of the line is going to speed things up. We're eight people away from the register and they keep jamming their cart into my feet because the person three ahead of us moved. I get out of line to grab a magazine (leaving my cart) and they push into the gap where my body was. Whoah, slow down dick-face, we're still twenty feet from the register, you can chill a little now.

On the flip side then is the people who are absurdly slow about getting their checkout done with. (and of course to double-down on dickishness, it's often the same people who were impatient and pushy when they were way back in line (*)).

There are two general classes of people who fail to check out quickly :

1. The epically incompetent. These people pay with a check, either because they are ancient geezers (excusable) or because they think cards are somehow inferior or checks are more convenient (inexecusable). They might go digging around in their purse for half an hour trying to find exact change, or somehow still don't know they can scan their card before the checker is done.

2. The intentionally slow. These people think everyone needs to chill and slow down; what's the rush? They might chat with the checker a bit. They think everyone else is rotten for being in such a hurry. OMG you epic douchebag; it's fine if you want to live a slow, relaxed, life, but it's not okay to impose that on everyone behind you in line. You probably drive slowly too and think that everyone behind you is in the wrong for wanting to go faster. You probably have a "keep your laws off my body" bumper sticker, and fail to see that your own behavior is the same kind of selfish forcing of your values on others.

(* = the double-dick seems to be the norm for airplane passengers; who are inevitably and annoying and pushy and do a lot of huffing when you are way back in line, but when they actually get up to the TSA guy they still have their shoes on, don't have their id in hand, are drinking a bunch of liquid, and act like it's some big surprise. Same thing with the overhead bin stowage of course).

12/19/2011

12-19-11 - SRAM

SRAM is rolling out a big promotional campaign this year, trying to convince people that their components are actually superior.

They've signed up lots of the pro teams. Just as with car racing, do not be mislead by what the pros use. I see lots of morons on forums saying "well the race team uses this, it must be great". No, the reason the race team uses it is because they are paid to use it.

SRAM double-tap shifters are fucking *awful*. Absolutely retarded. Imagine your right mouse button was taken away and instead you had to double-tap the left to accomplish that function. Yep, it's horrible.

Double-press GUI is always horrible and always should only be used as a last resort. We use it sometimes in games because there just aren't enough buttons on console controllers, but a smart game designer knows that only the secondary functions should go on double-tap buttons (the same goes for "hold" buttons) and the twitch functions should be on their own dedicated button.

Actually it's even worse than that. They don't do the right thing when you're at the edge of the gear shift limits. So like if you are at the low end, you can't go any lower (which you would accomplish by double-tapping), it will still let you single tap (to up-shift). So you're riding up a steep hill and you want a lower gear, you go to double-tap, and oh fuck half way through it the lever won't let you do the double-tap, but you've already single-tapped. There's no way to back out of it, when you let go it will up-shift you and you'll be fucked.

The STI system is just one million billion times better. But it's patented. That's why all these shift levers are so dang expensive, because they're patented.

It's also why there has to be a new lever system every year, a new size of bottom bracket, a new headset system - it's so that the manufacturer can patent it and/or make an exclusive line so that they can rip you off. The old system was perfectly fine functionally, the problem with it was that generic brands were starting to come out with cheap decent components for that system. We can't have that.

It's just like medicine of course, though with medicine it's much more diabolical.

Certainly with medicine it's obvious that there should be laws that prevent the pointless pushing of the new expensive product when it's not actually any better than cheap old solutions.

But I think it would actually be in the world's best interest to have a similar law for everything. It would be hard to phrase and hard to enforce, but the idea is something like - you must make components that are compatible with others on the market unless the incompatibility is for a necessary functional reason. It's actually much better for the free market and competition of products can plug and play and the consumer can choose based on pice and functionality, not compatibilty with some bullshit proprietary interface.

One that annoys me is car parts; most of the car parts for a Porsche or BMW or whatever are actually identical to the ones for a cheaper car (like a VW for example) - but they intentionally make the interface ever so slightly different so that you can't just go buy the cheaper part. The parts are all made by Bosch or whoever major part supplier anyway, it's not like you get a better brand of part for the money. The interesting thing to me is that the car maker doesn't really benefit from this, it's the part maker who does, so there must be some kind of collusion where the car maker gets a kickback in exchange for using the proprietary part.

Maybe the most obvious example is car wheels. Wheels are wheels, there's no need for them to be car specific, but the auto manufacturers intentionally use different bolt spacings (5x130, 4x110, etc) so that you can't go buy cheap mass market wheels for your fancy car. You can cross-shop the exact same wheel with different bolt spacings, and the price difference can be 2X or more.

12/17/2011

12-17-11 - LZ Optimal Parse with A Star Part 4

Continuing ...
Part 1
Part 2
Part 3

So we have our A star parse from last time.

First of all, when we "early out" we still actually fill out that hash_node. That is, you pop a certain "arrival", then you evaluate the early out conditions and decide this arrival is not worth pursuing. You need to make a hash_node and mark it as a dead end, so that when you pop earlier arrivals that see this node, they won't try to visit it again.

One option would be to use a separate hash of just bools that mark dead ends. This could be a super-efficient smaller hash table of bit flags or bloom filters or something, which would save memory and perhaps speed.

I didn't do this because you can get some win from considering parses that have been "early outed". What you do is when you decide to "early out" an arrival, you will not walk to any future nodes that are not yet done, but you *will* consider paths that go to nodes that were already there. In pseudo-code :


pop an arrival

check arrival early outs and just set a flag

for all coding choices at current pos
{
  find next_node
  if next_node exists
    compute cost to end
  else
    if ! early out flag
       push next_node on arrivals stack
}

So the early out stops you from creating any new nodes in the graph walk that you wouldn't have visited anyway, but you can still find new connections through that graph. What this lets you do in practice is drive the early out thresholds tighter.

The other subtlety is that it helps a lot to actually have two (or more) stages of early out. Rather than just stop consider all exit coding choices once you don't like your arrival, you have a couple of levels. If your arrival looks sort of bad but not terrible, then you still consider some of the coding choices. Instead of considering 8 or 16 coding choices, you reduce it to 2 or 4 which you believe are likely advantageous.

The exact details depend on the structure of your back end coder, but some examples of "likely advantangeous" coding choices that you would consider in the intermediate early out case : if you have a "repeat recent offset" structure like LZX/LZMA, then those are obvious things to include in the "likely advantageous". Another one might be RLE or continue-previous type of match codes. Another would be if the literal codes below a certain number of bits with the current statistics. Also the longest match if it's longer than a certain amount.

Okay, so our A star is working now, but we have a problem. We're still just not getting enough early outs, and if you ran this on a big file it will take forever (sometimes).

The solution is to use another aspect we expect from our LZ back end, which is "semi-locality". Locality means that a decision we make now will not have a huge effect way in the future. Yes, it has some effect, because it may change the state and that affects the future, but over time the state changes so many times and adapts to future coding that the decision 4000 bytes ago doesn't matter all that much.

Another key point is that the bad (slow) case occurs when there are lots of parses that cost approximately the same. Because of our early out structure, if there is a really good cheap parse we will generally converge towards it, and then the other choices will be more expensive and they will early out and we won't consider too many paths. We only get into bad degeneracy if there are lots of parses with similar cost. And the thing is, in that case we really don't care which one we pick. So when we find an area of the file that has a huge branching factor that's hard to make a decision about, we are imperfect but it doesn't cost us much overall.

The result is that we can cut up the file to make the parse space tractable. What I do is work in "quanta". You take the current chunk of the file as your quantum and parse it as if it was its own little file. The parse at the beginning of the quantum will be mostly unaffected by the quantum cut, but the parse at the end will be highly affected by the false EOF, so you just throw it out. That is, advance through the first 50% or 75% of the parse, and then start the next quantum there.

There is one special case for the quantum cutting which is long matches that extend past the end of the quantum. What you would see is when outputting the first 50% of the parse, the last code will be a match that goes to the end of the quantum. Instead I just output the full length of the match. This is not ideal but the loss is negligible.

For speed you can go even further and use adaptive quantum lens. On highly degenerate parts of the file, there may be a huge node space to parse that doesn't get early-out'ed. When you detect one of these, you can just reduce the quantum len for that part of the file. eg. you start with a quantum length of 4096 ; if as you are parsing that quantum you find that the hash table occupancy is beyond some threshold (like 1 million nodes for example), you decide the branching factor is to great and reduce the quantum length to 2048 and resume parsing on just the beginning of that chunk. You might hit 1 million nodes again, then you reduce to 1024, etc.

That's it! Probably a followup post with some results numbers and maybe some more notes about subtle issues. I could also do several long posts about ideas I tried that didn't work which I think are sort of interesting.

old rants