Saturday, February 5, 2011

Find a good sample of a query log

You have a stream of queries that is effectively infinite, so there is no way to hold all of it in memory. The query frequencies follow a power law. Find a good strategy for sampling the stream so that the sample is representative of both the head (the few very frequent queries) and the tail (the many rare ones) of the power law.
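
One possible direction, offered as a sketch rather than as the answer: keep two bounded samples side by side. A classic reservoir sample over query occurrences tends to be dominated by the head, while a "bottom-k" sample (keep the k distinct queries with the smallest hash values) is roughly uniform over distinct queries, so it is dominated by the tail. The class name `HeadTailSampler`, the parameter `k`, and the choice of SHA-1 as the hash are all assumptions made for illustration; nothing in the question prescribes them.

```python
import hashlib
import heapq
import random


class HeadTailSampler:
    """Two fixed-size samples of a query stream (hypothetical helper).

    head        : reservoir sample over occurrences (frequent queries dominate)
    tail_counts : the k distinct queries with the smallest hash values,
                  mapped to how often each one has been seen
    """

    def __init__(self, k, seed=0):
        self.k = k
        self.rng = random.Random(seed)
        self.seen = 0            # number of stream items processed so far
        self.head = []           # reservoir over occurrences
        self.tail_heap = []      # max-heap on hash, stored as (-hash, query)
        self.tail_counts = {}    # query -> occurrences, for sampled queries only

    @staticmethod
    def _hash(query):
        # Deterministic 64-bit hash; SHA-1 is an arbitrary choice here.
        digest = hashlib.sha1(query.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, query):
        self.seen += 1

        # Head: reservoir sampling (Algorithm R). Every occurrence ends up
        # in the reservoir with probability k / seen.
        if len(self.head) < self.k:
            self.head.append(query)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.k:
                self.head[j] = query

        # Tail: bottom-k by hash. Repeats of a sampled query only bump a
        # counter, so frequency does not affect the chance of being kept.
        if query in self.tail_counts:
            self.tail_counts[query] += 1
            return
        h = self._hash(query)
        if len(self.tail_heap) < self.k:
            heapq.heappush(self.tail_heap, (-h, query))
            self.tail_counts[query] = 1
        elif h < -self.tail_heap[0][0]:
            # Smaller than the largest kept hash: evict that one.
            _, evicted = heapq.heapreplace(self.tail_heap, (-h, query))
            del self.tail_counts[evicted]
            self.tail_counts[query] = 1
```

Calling `add(query)` once per incoming query keeps memory at O(k) regardless of stream length. Because the admission threshold of the bottom-k sample only ever decreases, a query that survives was admitted on its first occurrence, so its counter is exact; that gives a frequency estimate for tail queries as a side effect. How to mix the two samples (and whether something like stratifying by estimated frequency would serve better) is left open, as in the original question.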
