Random commentary about Machine Learning, BigData, Spark, Deep Learning, C++, STL, Boost, Perl, Python, Algorithms, Problem Solving and Web Search

Friday, August 15, 2014

Antonio Gulli Adaboost

Adaboost is one of my favorite Machine Learning algorithm. The idea is quite intriguing: You start from a set of weak classifiers and learn how to linearly combine them so that the error is reduced. The result is a strong classifier built by boosting the weak classifiers.

Mathematically, given a set of labels y_i = { +1, -1 } and a training dataset x_i, Adaboost minimizes the exponential loss function sum_i (exp - (y_i * f(x_i)) }. The function f = sum_t (alpha_t h_t) is a linear combination of the h_t classifier with weight alpha_t. For those loving Optimization Theory , Adaboost is a classical application of Gradient Descend. The algorithm is quite simple and has been included in the top 10 data mining algorithms in 2007 and the Gödel prize in 2003. After Adaboost, Boostingbecome quite popular in the data mining community with application in Ranking and Clustering.

## No comments:

## Post a Comment