Thursday, February 3, 2011

Google, Bing, and web browsing data

It may not be fair for me to comment, but I decided to mention this interesting post from Greg, which is quoted below. Thanks, Greg; it was my pleasure to quote you.

I suppose I should comment, as everyone else on the planet has, on Google's claim that Bing is copying their results.

My reaction is mostly one of surprise. I am surprised that Google wants this issue discussed in the press. I am surprised that Google wants this aired in the court of public opinion.

Google is trying to draw a line on what use of behavior data is acceptable. Google clearly thinks they are on the right side of that line, and I do too, but I'm not sure the average searcher would agree. And that is why Google is playing a dangerous game here, one that could backfire on them badly.

Let's take a look at what Google Fellow Amit Singhal said:
This experiment confirms our suspicion that Bing is using some combination of:
  • Internet Explorer 8, which can send data to Microsoft via its Suggested Sites feature
  • the Bing Toolbar, which can send data via Microsoft’s Customer Experience Improvement Program
or possibly some other means to send data to Bing on what people search for on Google and the Google search results they click.

Of course, what Amit does not mention here is that the widely installed Google Toolbar and the fairly popular Google Chrome web browser send very similar data back to Google, data about every page someone visits and every click they make. Moreover, Google tracks almost every web search and every click after a web search made by web users around the world, since almost every web search is done on Google.

By raising this issue, Google very publicly is trying to draw a particular line on how toolbar and web browsing data should be used, and that may be a dangerous thing for Google to do. The average searcher, for example, may want that line drawn somewhere other than where Google might expect it to be drawn -- they may want it drawn at not using any toolbar/Chrome data for any purposes, or even not using any kind of behavior data at all -- and, if that line is drawn somewhere other than where Google wants it, Google could be hurt badly. That is why I am surprised that Google is coming out so strong here.

As for the particular issue of whether this is copying or not, I don't have much to say on that, but I think the most thought-provoking piece I have seen related to that question is John Langford's post, "User preferences for search engines". John argues that searchers own their browsing behavior and can reveal what they do across the web to whoever they want to. Whether you agree or not with that, it is worth reading John's thoughts on it and considering what you think might be the alternative.


  1. You are completely right, mate, but the point is not how the search engines are collecting data, but how they are feeding their index!

  2. I think Google's point is not what data is sent back to Microsoft. It is that while Google works very hard to discover that a URL U is very relevant to the query q (using sophisticated IR methods), Microsoft can discover the same relationship without any IR method, just by tracking what users click on for various searches. In some ways both Microsoft and Google are right. The very famous PageRank algorithm itself allows Google to use data that they did not create, so there is nothing wrong if Microsoft does it too. Note that Microsoft will suffer if the quality of Google's results declines. So Microsoft is not exactly using Google's results; they are using millions of users' filtering of Google's results, just like PageRank, which uses millions of users' recommendations of various pages. All I can say is "you are right, Google, but sorry".