Web index compression is one of the most interesting research field. I stressed the similarity with traditional bandwidth reduction in the past.
Compressed Web Indexes is a great theoretical result by Yahoo!. They provide a strong bound for delta and gamma encoding. The index size is 1/3 of the document size. Now we have a valid justification for this emphirical observation. Another nice side observation is that terms distribution follows a Double Pareto Law and not just a Zipf law.
Great results on Delta encoding, mr. Yahoo!.
No comments:
Post a Comment