Did Google just crowdsource webspam removal?

March 10, 2011, 11:44 PM UTC

Google could use the new personal search blocks to better its search algorithms globally.

Google (GOOG) today announced that it would allow all users to block certain sites that they didn’t want to appear in their search results. This is a great tool for eliminating bad search results.

You’ve probably had the experience where you’ve clicked a result and it wasn’t quite what you were looking for. Many times you’ll head right back to Google. Perhaps the result just wasn’t quite right, but sometimes you may dislike the site in general, whether it’s offensive, pornographic or of generally low quality. For times like these, you’ll start seeing a new option to block particular domains from your future search results. Now when you click a result and then return to Google, you’ll find a new link next to “Cached” that reads “Block all example.com results.” Once you click the link to “Block all example.com results” you’ll get a confirmation message, as well as the option to undo your choice. You’ll see the link whether or not you’re signed in, but the domains you block are connected with your Google Account, so you’ll need to sign in before you can confirm a block.

What was a Chrome extension last month is now available to all Google users.

I’ve been doing this for weeks with a Chrome extension that Google provided last month. It definitely does clean up my search results. For instance, there are a lot of sites that scrape my posts here at Fortune or elsewhere. When I want to reference a previous post and need to find it in a Google search, sometimes scraped content shows up above Fortune’s results (that’s another story). Now I can kill the scrapers once and for all, or at least until their authors dream up more domains to use as scrapers. It is a cat-and-mouse game, but now Google has the upper hand.

But, that’s just half of the equation.

Google could, and likely will, use the blockage data from its users to provide better search results. They’ve already used the data gathered from the Chrome extension passively, as a check on their recent Content Farm algorithm update. At the time, Matt Cutts, Google WebSpam director, said:

It’s worth noting that this [Content Farm] update does not rely on the feedback we’ve received from the Personal Blocklist Chrome extension, which we launched last week. However, we did compare the Blocklist data we gathered with the sites identified by our algorithm, and we were very pleased that the preferences our users expressed by using the extension are well represented. If you take the top several dozen or so most-blocked domains from the Chrome extension, then this algorithmic change addresses 84% of them, which is strong independent confirmation of the user benefits.

So Google is already checking algorithm improvements against block lists.  That’s just a start.  It would make a lot of sense to expand that into a better search product – even a social one.

For instance, if Google gave me the option of blocking not only the sites I don’t like, but also the sites that my friends don’t like, I’d certainly be interested in checking out the results. Perhaps just demoting commonly blocked sites in search results would do the trick.

Google could then take the aggregate of what is blocked by certain demographics and push that through to the greater population. Perhaps just the worst offenders could be blacklisted at first.

Perhaps. There is also an opportunity for abuse with this method, and it goes against Google’s ‘Computer Science solutions’ mantra. If I am a content farmer and I want to get rid of the competition, I send my 1,000 fake Google accounts out and block all of my competitors’ content.

However, when it comes down to it, Google wants to rid its results of spam (I think). User blockage data is the perfect tool, and Google has just mainstreamed the collection of it.
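To make the demotion idea concrete, here is a rough sketch of how aggregated blocks might push a commonly blocked domain down the results. This is purely my own illustration, not anything Google has described: the function name `demote_blocked`, the `threshold` and `penalty` parameters, and the shape of the data are all assumptions.

```python
from collections import defaultdict


def demote_blocked(results, blocks, total_users, threshold=0.05, penalty=0.5):
    """Re-rank results, demoting domains blocked by a large share of users.

    results:     list of (domain, score) pairs (hypothetical ranking scores)
    blocks:      iterable of (user_id, domain) block events
    total_users: number of accounts in the population being considered
    threshold:   share of users that must block a domain before it is demoted
    penalty:     multiplier applied to the score of a demoted domain
    """
    # Collect distinct blockers per domain so a repeat block
    # from the same account is only counted once.
    blockers = defaultdict(set)
    for user_id, domain in blocks:
        blockers[domain].add(user_id)

    rescored = []
    for domain, score in results:
        share = len(blockers.get(domain, ())) / total_users
        if share >= threshold:
            score *= penalty  # demote rather than remove outright
        rescored.append((domain, score))

    # Highest score first, as a search results page would show them.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```

Counting distinct accounts stops one user from blocking the same site over and over, but it does nothing against the 1,000 fake accounts scenario above. Any real system would presumably need some measure of account trust on top of this.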