Random commentary about C++, STL, Boost, Perl, Python, Algorithms, Problem Solving and Web Search
The approach that everyone always does? Think up a bunch of features and train a classifier?

That's just about every paper on this topic these days. All the papers look like this: Here's a couple hundred features we tried, here are the dozen that mattered, here are the types of classifiers we tried but it didn't make much difference which kind we used.

I think the more interesting work on this topic is when we include other data such as real-time user behavior data in the classification. That's when things start to get exciting.