This turns out to be a nice graph problem, where you need to efficiently compute a prefix search on the huge social graph, with 4x10^8 nodes and 7.7x10^10 edges in FoF and 10^9 nodes and 10^10 edges in OoF.
Four options have been explored:
- a classical trie which is very expensive for pointer memory consumption;
- a sorted vector searched with range queries and binary search which doesn't scale and it is expensive in terms of dynamic insertion;
- brute force search;
- a version bloom filtered architecture, which optimizes brute force search;
LinkedIn has been solving a similar real-time personalized relevance ranking problem for a few years. They order search results by distance in your social graph.
ReplyDelete