Antonio Gulli's coding playground: Got assigned a new patent: how to bootstrap a learning to rank system with real traffic information

Tuesday, October 25, 2011

Got assigned a new patent: how to bootstrap a learning to rank system with real traffic information

Just got assigned a patent filed back in 2005, "Sampling internet user traffic to improve search results". The problem we were trying to address was bootstrapping a learning-to-rank system in the absence of user search information. This is a typical problem when you are not the incumbent search engine and have not yet accumulated usage information and user behaviour data (for more information, see Calculating Search Rankings with User Web Traffic Data). How can you compensate for this bootstrap situation?

The key intuition was to use web traffic information collected by way of web proxies, observing user traffic and navigational information, including traffic generated by querying other search engines. Similar traffic can be observed by mining a collection of web logs for the HTTP_Referer field. The methodology was used to improve the freshness, coverage, ranking and clustering of search engine results and, more generically, may include monitoring web traffic on remote web servers on the communications network.
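To make the referrer-mining idea concrete, here is a minimal sketch in Python. It is not the method described in the patent, just an illustration of the general idea under stated assumptions: the log has already been parsed into (referrer, requested URL) pairs, and the table of search engine hosts and query parameters (SEARCH_QUERY_PARAMS) is a hypothetical, hand-maintained mapping. The function names are made up for this example.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Hypothetical mapping (an assumption for this sketch):
# search engine domain -> query-string parameter holding the search terms.
SEARCH_QUERY_PARAMS = {
    "google.com": "q",
    "bing.com": "q",
    "search.yahoo.com": "p",
}


def extract_query(referer):
    """Return the normalized search query found in a referrer URL, or None."""
    parts = urlsplit(referer)
    host = (parts.hostname or "").lower()
    for domain, param in SEARCH_QUERY_PARAMS.items():
        # Match the domain itself or any of its subdomains (e.g. www.google.com).
        if host == domain or host.endswith("." + domain):
            values = parse_qs(parts.query).get(param)
            if values:
                return values[0].strip().lower()
    return None


def count_query_clicks(records):
    """Count (query, landing URL) pairs from (referrer, requested URL) records."""
    counts = Counter()
    for referer, url in records:
        query = extract_query(referer)
        if query:
            counts[(query, url)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        ("http://www.google.com/search?q=Learning+to+rank", "http://example.org/ltr"),
        ("http://www.bing.com/search?q=learning+to+rank", "http://example.org/ltr"),
        ("http://example.com/some/page", "http://example.org/ltr"),
        ("-", "http://example.org/home"),
    ]
    for (query, url), n in count_query_clicks(sample).most_common():
        print(n, query, url)
```

The resulting (query, landing URL) counts could then serve as a rough click signal for a ranker that has no search logs of its own. A real pipeline would also need to deal with bots, missing or stripped referrers, and privacy constraints, none of which are handled here.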

1 comment:

Unknown, October 28, 2011 at 9:17 PM
Congrats. It was in 2005, now we are in 2011. I guess patent office taking long time in the fast pace era? :)