There is a simple trick for emulating a relational join on top of MapReduce. Since all the items with the same key are sent to the same reducer, it is easy to perform the join on the reduce side: both inputs emit the join column as the key, so matching records meet in the same reduce call. What other little tricks do you need to adopt?
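One commonly used trick is to have the mapper tag each record with the table it came from, so the reducer can tell the two sides apart. Here is a minimal in-memory sketch of that idea in plain Python, not Hadoop code. The `users` and `orders` tables, the `"U"`/`"O"` tags, and all function names are made up for illustration, and the `shuffle` function only imitates what the framework does between the map and reduce phases.

```python
from collections import defaultdict
from itertools import product


def map_phase(users, orders):
    """Emit (join_key, (tag, payload)) pairs; the tag records the source table."""
    for user_id, name in users:
        yield user_id, ("U", name)   # "U" = hypothetical users table
    for user_id, item in orders:
        yield user_id, ("O", item)   # "O" = hypothetical orders table


def shuffle(pairs):
    """Imitate the framework: group all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Split the values by tag, then pair every left record with every right one."""
    left = [payload for tag, payload in values if tag == "U"]
    right = [payload for tag, payload in values if tag == "O"]
    for name, item in product(left, right):
        yield key, name, item


def reduce_side_join(users, orders):
    """Run map, shuffle, and reduce in sequence; this is an inner join."""
    groups = shuffle(map_phase(users, orders))
    result = []
    for key in sorted(groups):
        result.extend(reduce_phase(key, groups[key]))
    return result


if __name__ == "__main__":
    users = [(1, "ann"), (2, "bob"), (3, "cat")]
    orders = [(1, "book"), (1, "pen"), (2, "lamp"), (4, "desk")]
    for row in reduce_side_join(users, orders):
        print(row)
```

Keys that appear on only one side (user 3 and order 4 here) produce no output, which is inner-join behaviour. The reducer in this sketch buffers both sides in memory; on a real cluster that can be a problem when one key has very many records.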
Obviously, you can always use a higher-level abstraction such as Hive on top of Hadoop.