Caching is an important aspect of modern search engines. Some assumptions that I see in academic papers may not hold in industrial search. In particular:
1) Inverted lists are nowadays stored in memory; they are no longer stored on disk;
2) Query traffic is periodic over the 24 hours of a day: there are morning queries and late-night queries;
3) Geo-localization is very important. What are the benefits and the costs?
4) Verticals may invalidate the cache. For instance, you need to mix fresh news with some cached results. How do you deal with that? What are the benefits and the costs?
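To make points 3 and 4 concrete, here is a minimal sketch of one possible design, not a description of how any production engine works. It assumes the user's region is part of the cache key and that each vertical gets its own time-to-live, so fresh news expires quickly while ordinary web results stay cached longer. The names `VERTICAL_TTL_SECONDS`, `cache_key` and `TTLResultCache`, as well as the TTL values, are hypothetical.

```python
import time

# Hypothetical per-vertical freshness budgets, in seconds.
VERTICAL_TTL_SECONDS = {"web": 3600.0, "news": 60.0}


def cache_key(query, region):
    """Normalize the query and make the region part of the key."""
    return (query.strip().lower(), region)


class TTLResultCache:
    """Result cache whose entries expire according to their vertical."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested
        self._store = {}     # (key, vertical) -> (expires_at, results)

    def put(self, query, region, vertical, results):
        expires_at = self._clock() + VERTICAL_TTL_SECONDS[vertical]
        self._store[(cache_key(query, region), vertical)] = (expires_at, results)

    def get(self, query, region, vertical):
        k = (cache_key(query, region), vertical)
        entry = self._store.get(k)
        if entry is None:
            return None  # miss: never cached for this region and vertical
        expires_at, results = entry
        if self._clock() >= expires_at:
            del self._store[k]  # stale: the vertical's TTL has elapsed
            return None
        return results
```

The trade-off is visible in the key itself: adding the region gives locally relevant results, but it splits one popular query into many entries and so lowers the hit ratio for a given cache size. Short TTLs on the news vertical keep results fresh at the cost of more backend work.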
The paper "Improved Techniques for Result Caching in Web Search Engines" models web search caching as a weighted problem. Query results have the same size, but they may have different benefits. In this sense, caching is no longer just the problem of maximizing the hit ratio; it becomes the problem of maximizing the total benefit.
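A toy example shows why the two objectives can disagree. This is only an illustration under simple assumptions, not the algorithm from the paper: the cache is static and pre-populated, every entry has the same size, and the benefit of a hit is a made-up number (think of it as backend cost saved). The query log, the benefit values, and the helper names `replay` and `pick_static_cache` are all hypothetical.

```python
from collections import Counter


def replay(log, cached, benefit):
    """Replay a query log against a fixed cache; return (hit ratio, total benefit)."""
    hits = sum(1 for q in log if q in cached)
    saved = sum(benefit[q] for q in log if q in cached)
    return hits / len(log), saved


def pick_static_cache(log, benefit, capacity, weighted):
    """Choose which queries to cache, by frequency or by frequency * benefit."""
    freq = Counter(log)
    if weighted:
        score = lambda q: freq[q] * benefit[q]
    else:
        score = lambda q: freq[q]
    return set(sorted(freq, key=score, reverse=True)[:capacity])


# Hypothetical log: a cheap, popular query and a rarer, expensive one.
log = ["weather"] * 6 + ["cheap flights paris"] * 4
benefit = {"weather": 5.0, "cheap flights paris": 50.0}

by_hits = pick_static_cache(log, benefit, capacity=1, weighted=False)
by_benefit = pick_static_cache(log, benefit, capacity=1, weighted=True)

print(replay(log, by_hits, benefit))     # (0.6, 30.0)
print(replay(log, by_benefit, benefit))  # (0.4, 200.0)
```

With room for a single entry, caching the most frequent query wins on hit ratio (0.6 versus 0.4), while caching the expensive query wins on total benefit (200 versus 30).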
I would like to see more papers that take the four considerations above into account.