Random commentary about Machine Learning, BigData, Spark, Deep Learning, C++, STL, Boost, Perl, Python, Algorithms, Problem Solving and Web Search
I know Java, so I would use the Hadoop infrastructure, or something similar for other languages. There are many pages about the problem, such as this one: http://developer.yahoo.com/blogs/hadoop/posts/2009/05/hadoop_sorts_a_petabyte_in_162/
External sorting (http://en.wikipedia.org/wiki/External_sorting) - if you have time but do not have machines :)
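For the single-machine case, here is a minimal sketch of an external merge sort in Python. It is only an illustration under stated assumptions: the input is a text file with one record per line, records are compared as plain strings, and the names `external_sort`, `_write_run` and `max_lines_in_memory` are made up for this example. It sorts memory-sized chunks into temporary "run" files, then does a k-way merge of the runs.

```python
import heapq
import os
import tempfile
from itertools import islice


def _write_run(sorted_lines, tmp_dir):
    """Write one sorted run to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".run", text=True)
    with os.fdopen(fd, "w") as run:
        run.writelines(sorted_lines)
    return path


def external_sort(input_path, output_path, max_lines_in_memory=100_000):
    """Sort the lines of input_path into output_path using bounded memory.

    Hypothetical helper: max_lines_in_memory is an assumed tuning knob that
    caps how many lines are held in RAM at once.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        run_paths = []

        # Phase 1: read chunks that fit in memory, sort each, spill to disk.
        with open(input_path) as src:
            while True:
                chunk = list(islice(src, max_lines_in_memory))
                if not chunk:
                    break
                # Make sure the last line ends with a newline so that
                # merged lines are not glued together.
                if not chunk[-1].endswith("\n"):
                    chunk[-1] += "\n"
                chunk.sort()
                run_paths.append(_write_run(chunk, tmp_dir))

        # Phase 2: k-way merge of the sorted runs; heapq.merge is lazy,
        # so only one line per run is held in memory at a time.
        runs = [open(path) for path in run_paths]
        try:
            with open(output_path, "w") as dst:
                dst.writelines(heapq.merge(*runs))
        finally:
            for run in runs:
                run.close()
```

Calling `external_sort("big.txt", "big.sorted.txt")` would sort a file that does not fit in memory, as long as a single chunk does. With a very large number of runs the merge may hit the open-file limit, in which case the runs would need to be merged in several passes.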