Tuesday, November 1, 2011

Innovating in the search space: is the problem solved?

Search is such a fascinating world. After 15 years in this space, I still feel the temptation to say "The problem is solved" or "I already saw that!". Are you sure? These days I am asking myself: "Is there a way to innovate and disrupt the space with something completely new?" I have some ideas about that, but before going there let me start from the basics and show that there is a user need that is currently not satisfied, although it should be.

Let's suppose that Bob submits the query "Cinema". What are search engines doing nowadays? Well, we have a bunch of static ranking functions based on the links found in the web graph, a bunch of dynamic machine learning techniques that try to improve the experience by leveraging the aggregated past queries of all users, and a bunch of personalization techniques based on the queries that Bob himself submitted in the past. So far so good. Is this enough? The simple answer is no.

Why? Because it would be absolutely disruptive to know that Bob is actually 24 years old, that he lives in Baltimore, that he recently checked in at a particular place in downtown Baltimore, that he has friends who recently saw certain films and liked them, that Bob himself recently read a certain book, that he loves certain songs, and so on and so forth. You are probably starting to follow me. If not, let me explain.

The fundamental assumption that we make in search is that a query is a good proxy for the user's needs. But that is not necessarily a correct assumption. Actually, it is the user himself who is a good proxy for the user's needs: the user and all the structured data he generates throughout his life. Nothing more, nothing less. If I know that you are currently in this place, and that three weeks ago you read (and liked) a book written by Ken Bruen (say, his 2001 crime novel), I would probably return the geographically closest cinema that is currently showing the film London Boulevard, because that is the big-screen adaptation of the book. You are 24 years old, and a lot of users in that demographic band liked the film.

By now you have probably understood the impact of this approach. The query "cinema" by itself does not carry enough context to provide a good experience for that specific user, because the search engine does not know the user well enough. It is the user himself who brings this contextual information, and it would be great to leverage it.
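To make the idea concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not how any real search engine works: the `UserContext` and `Screening` types, the cinema names, the coordinates, and the bonus weights (1.0 for a liked book, 0.5 for a matching age band) are all hypothetical. It re-ranks cinemas for the query "cinema" by combining distance with what we know about the user.

```python
from dataclasses import dataclass, field
from math import asin, cos, radians, sin, sqrt
from typing import Optional, Set, Tuple


@dataclass
class UserContext:
    """Hypothetical bundle of signals the user brings along."""
    age: int
    lat: float
    lon: float
    liked_books: Set[str] = field(default_factory=set)


@dataclass
class Screening:
    """Hypothetical candidate result: a film showing at a cinema."""
    cinema: str
    lat: float
    lon: float
    film: str
    based_on_book: Optional[str] = None
    popular_age_range: Tuple[int, int] = (0, 0)  # (0, 0) means "no data"


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def score(user, s):
    """Closer is better; user-context matches add (arbitrary) bonuses."""
    value = 1.0 / (1.0 + haversine_km(user.lat, user.lon, s.lat, s.lon))
    if s.based_on_book is not None and s.based_on_book in user.liked_books:
        value += 1.0  # the user liked the book the film is adapted from
    low, high = s.popular_age_range
    if low <= user.age <= high:
        value += 0.5  # the film is popular in the user's age band
    return value


def rank(user, screenings):
    """Return the screenings ordered from best to worst for this user."""
    return sorted(screenings, key=lambda s: score(user, s), reverse=True)


# Bob: 24 years old, in downtown Baltimore, liked the book.
bob = UserContext(age=24, lat=39.29, lon=-76.61, liked_books={"London Boulevard"})

screenings = [
    Screening("Downtown Cinema", 39.29, -76.62, "Some Other Film"),
    Screening("Harbor Cinema", 39.28, -76.60, "London Boulevard",
              based_on_book="London Boulevard", popular_age_range=(18, 34)),
    Screening("Suburb Cinema", 39.40, -76.60, "London Boulevard",
              based_on_book="London Boulevard", popular_age_range=(18, 34)),
]

ranked = rank(bob, screenings)
for s in ranked:
    print(s.cinema, "-", s.film)
```

With these made-up numbers, the nearby cinema showing the adaptation comes first, and the closest cinema of all drops to last place because nothing about its film matches Bob. A purely distance-based ranking for "cinema" would have put it on top.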

Now, two observations are in order.

First, current personalization techniques again make the fundamental assumption that queries are a good proxy for the user. So they enrich the current query with a context given by the queries submitted in the past. This is a good direction to pursue, but past queries are not as rich a source to mine as the context that the user brings along through his past social activities.

Second, a common critique of personalization is that you don't actually need it, because it is very inexpensive to refine a query if you are not satisfied with the current results. I don't buy this argument, because the fundamental goal of a search engine is to help users, and users should not have to submit many queries to get the desired results. This is particularly true on a mobile device, where every single character you type has a higher cost than in the PC environment.

So let's say it again: it's all about the user and her activities. Those two are the elements of disruption in the next age of search. This is what I think. Now, let's repeat the above experiment, and let me introduce Alice. Alice just picked up her mobile phone and started to type "Gifts".

You are a next generation search engine. What would be the natural thing to do in order to offer her the best user experience?


  1. This is an evolution of my thoughts contained here


  2. I think you're defining personalized search narrowly specifically to be able to reject it as a solution. In fact, what you are talking about, using all information we know about a user to satisfy their information need, is personalized search. And, yes, I agree, this is the future of search.

  3. It's 1.30am and I can't sleep, so I'm just going to spew out a bunch of thoughts and see if it puts my mind to rest, however garbled these comments may be.

    Anyway I see three distinctions in how people search:

    One is the simple use of a search engine as a gateway, which I guess is around 70% of all queries to the major search engines: When someone types in "ebay", or "amazon", or "imdb", just to click on the first link - I almost feel search engines might as well skip the 10 blue links approach and deliver the user directly to their result for such generic queries, something which Firefox delivers very well when simply using the address bar.

    Second are your long tail queries (not the correct use of the term, but it's the one my mind always uses). Very specific queries which require a webpage to exist with in-depth knowledge, which is where search engines operate best, and which will probably become more important over time (to pluck an example query out of the air: "Is it possible to catch a cold by shaking hands with someone?").

    However, I feel there's a middle ground which search engines do not deliver so well: when users start generically but do not know their intent, or are just info-gathering on a subject. I feel 10-blue-links, and the simplified and quite hidden option of "Related Searches", do not succeed.

    I feel search engines should have an algorithmic skill in knowing the item searched for, and offer good context information.

    To explain better with two examples: I'm picking the 2009 "Sherlock Holmes film", and separately the "Nokia 900" WP7 phone.

    Now, if I want reviews for the Sherlock Holmes film, then for me, in desktop mode, the simplest way to my info is "Sherlock Holmes reviews". If I want to buy the Nokia phone, it is simpler to search for "buy Nokia 900". Or again, for reviews, "Nokia 900 reviews".

    However, if I enter the terms very generically ("Sherlock Holmes film" / "Nokia 900"), I will just get a random selection of 10-blue-links, probably including a Wikipedia link and an IMDB link amongst others. This provides quite a lottery in search relevance.

    What would be better for me would be to see a context-map of good refinements, based on the search engine having deep knowledge of the product or item I have searched for. In this instance it would be great to see a search engine offer some context for me around my queries, similar to Related Searches but a lot more visual, visible and intuitive. So if I search for "Nokia 900", without specifying my intent ("buy" / "reviews"), it would be great to be able to navigate easily between different contexts so I can assimilate as much data as possible. For instance seeing a spider diagram of different contexts per product, and being able to browse back and forth from "reviews" to "buy" to "history" to "comparisons" and so forth. A search engine should be able to know the contexts for each item and adjust accordingly, and allow me to scan between them at will so that I can data-gather accordingly. For me, this is a real letdown of current search engines.

    On a mobile this becomes more relevant, as I want to save my typing when possible, and save on refinements.

    Anyway, in a few years, when OS interfaces become more simplified (e.g. you use specific apps for specific tasks in the way that phones and tablets succeed), the power and intelligence of search engines will become more critical, as they will be judged on how easily they get you to your answer, and the simpler we make it for the user, the bigger the difference in a search engine's perceived usefulness.

    I've run out of steam but remind me before I leave to describe my "Grand Theory of Google" to you, because I feel it is a real threat to competitors when it arrives in two or three years.

  4. And you can obviously extend these ideas into the ads space as well.