2/01/2010

02-01-10 - Google Autocorrect

One of the fucking annoying things Google has silently added recently is auto-correcting searches. For example, try any of these searches :

prosac

forman

oper labs

and what you will get is results for :

prozac

foreman

opera labs

Note that I'm not talking about the "did you mean" suggestion. I mean it silently decides to give you results for a different word. Awful.

Fortunately there is a trick to get around this. Just put your search word in quotes. Apparently you could also use a +, though that also makes the word mandatory.
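If you build search links from a script, the quoting trick is easy to automate. Here's a minimal sketch; the helper names `exact_query` and `search_url` are made up for illustration, and it assumes the usual `https://www.google.com/search?q=` URL form :

```python
from urllib.parse import urlencode


def exact_query(words):
    """Wrap each word in double quotes so it is searched as typed."""
    return " ".join('"{}"'.format(w) for w in words)


def search_url(query):
    """Build a search URL for the query (URL form is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(exact_query(["prosac", "ransac"]))
    print(search_url(exact_query(["prosac", "ransac"])))
```

This quotes each word on its own rather than the whole phrase, so the words can still appear anywhere on the page.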

See for example : here

It seems to me at least part of the problem is that the stemming is not aware of changing word meaning. e.g. obviously if I search "widget" then results that contain "widgets" are probably good, but changing "prosac" to "prozac" gives me completely unrelated results. Google should only automatically broaden the search to word variants that are actually related.
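To make the distinction concrete, here's a toy sketch of "only broaden to related variants". It is not how Google does it (I have no idea what they actually run); the suffix list and the function names are assumptions for illustration, and a real system would use a proper stemmer. The idea is just that a variant is accepted only if it reduces to the same stem as the query word :

```python
# Toy suffix list; an assumption for illustration, not a real stemmer.
SUFFIXES = ("ing", "es", "ed", "s")


def toy_stem(word):
    """Strip one common suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def is_related_variant(query_word, candidate):
    """True only when both words reduce to the same toy stem."""
    return toy_stem(query_word) == toy_stem(candidate)


def broaden(query_word, candidates):
    """Keep only the candidates that are inflections of the query word."""
    return [c for c in candidates if is_related_variant(query_word, c)]


if __name__ == "__main__":
    print(broaden("widget", ["widgets", "prozac"]))
    print(broaden("prosac", ["prozac", "foreman"]))
```

With this rule "widget" broadens to "widgets", but "prosac" never turns into "prozac", because that is a spelling substitution and not an inflection of the same word.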

3 comments:

Autodidactic Asphyxiation said...

A better suggestion might be to suppress certain "stemmings" (an unfortunately overloaded term thanks to the unwashed) using -prozac.

"prosac" gives me some music, and then a bunch of places that has "prozac" misspelled "prosac." Maybe Google should try to be more editorial there, and distinguish Prozac-misspellers, but doesn't seem like it would necessarily be useful.

"forman" did the right thing for me. I didn't even get "did you mean..."

What is "oper labs" anyway, besides a canonical example to demonstrate autocorrect?

Anyway, I'm 99% sure that using your suggestion of less-zealous "stemming" would cost Google a lot of money.

cbloom said...

It really only becomes clear how wrong it is when you add more words.

"prosac" is the most obvious case.

Even if you search something like

"prosac ransac"
"prosac image match"
"prosac sample consensus"

it keeps insisting on giving me prozac results even though it should be pretty damn clear from those searches that I actually mean prosac.

If you just search "prosac" then sure it's more reasonable to assume you made a mistake.

"forman" interestingly has the opposite problem; if you just type "forman" it gives you okay results, but the more words I add to it, the more it wants to give me "foreman" results instead.

I don't have any problem with Google doing this by default. But at least show me a button saying that you did it and let me turn it off. Something like "we have also shown results for prozac".

cbloom said...

Wow, I just found a really weird one of these.

If you search for "Synpower", Google will give you hits that contain "Synergy" (and highlight the occurrences of Synergy).
