Search is dead. Well, perhaps it is not, but I am feeling provocative this morning. The first search engine was launched in 1993. Since then we have made incredible progress in content discovery, indexing, scalability, ranking, machine learning, variety, and UX. However, the paradigm is still the same: you need to have an idea of what you are searching for well before you start submitting keyword-based queries. Exactly like in 1993.
I think that Awareness is the new king. By awareness, I mean something that sends you the information you would like to get, working on your behalf with no need for explicit searches. To be honest, the topic is not new: Pattie Maes has been working on Intelligent Software Agents since 1987. However, I don't see a disruptive revolution yet. We still receive alerts via Google Alerts, which is based on explicit queries. Some initial steps - but not a revolution - are shown in Google Now, where the location is the implicit query. Some other changes are in Google Glass, where the captured image might be a surrogate for the query. Still, it is a world where 95% of our actions are described and learned via keywords.
What do you think?
Random commentary about Machine Learning, BigData, Spark, Deep Learning, C++, STL, Boost, Perl, Python, Algorithms, Problem Solving and Web Search
Sunday, August 31, 2014
Saturday, August 30, 2014
Internet of things: what is new?
We live in a world of catchy words. What was called Parallel Computation yesterday became NoW (Network of Workstations), then Grid, and then Cloud. Same thing, different words.
Another example is IoT, which is pretty much Home Automation, something that has been discussed for the past 40 years.
So what is cool about IoT?
Frankly, I don't know. However, I got an idea there and filed a patent. Let's see what happens.
Friday, August 29, 2014
Learning something new: Apache Spark
Any suggestions about it?
Friday, August 22, 2014
Assignment matching problem
Assume that we have workers and tasks to be completed. For each pair (worker, task) we know the cost that the worker should be paid to complete the task. The goal is to complete all the tasks and to minimize the total cost, under the condition that each worker can execute only one task and vice versa.
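This is the classical assignment problem, usually solved in O(n^3) with the Hungarian algorithm. As a warm-up, here is a brute-force sketch that simply tries every permutation of tasks over workers; the cost matrix is made-up toy data and the approach is exponential, so it is only meant to illustrate the problem statement.

// Brute-force sketch of the assignment problem: try every permutation of
// tasks over workers and keep the cheapest one. O(n!), illustration only;
// the Hungarian algorithm solves the same problem in O(n^3).
#include <algorithm>
#include <iostream>
#include <limits>
#include <numeric>
#include <vector>

int main() {
    // cost[w][t] = cost paid if worker w executes task t (toy data).
    std::vector<std::vector<int>> cost = {
        {4, 2, 8},
        {4, 3, 7},
        {3, 1, 6}};
    const size_t n = cost.size();

    std::vector<size_t> perm(n);
    std::iota(perm.begin(), perm.end(), 0);   // perm[w] = task assigned to worker w

    int best = std::numeric_limits<int>::max();
    std::vector<size_t> bestPerm;
    do {
        int total = 0;
        for (size_t w = 0; w < n; ++w) total += cost[w][perm[w]];
        if (total < best) { best = total; bestPerm = perm; }
    } while (std::next_permutation(perm.begin(), perm.end()));

    std::cout << "minimum total cost = " << best << "\n";
    for (size_t w = 0; w < n; ++w)
        std::cout << "worker " << w << " -> task " << bestPerm[w] << "\n";
    return 0;
}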
Thursday, August 21, 2014
Wednesday, August 20, 2014
Monday, August 18, 2014
Find All Pairs Shortest Paths (Floyd-Warshall Algorithm)
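A possible sketch of Floyd-Warshall in C++, with the weighted digraph stored as an adjacency matrix; the 4-node graph is made-up toy data.

// Floyd-Warshall: all-pairs shortest paths on a weighted digraph stored as
// an adjacency matrix; INF marks a missing edge. O(V^3) time, O(V^2) space.
#include <iostream>
#include <vector>

int main() {
    const int INF = 1 << 29;
    // dist[i][j] = weight of edge i->j, INF if absent (toy 4-node graph).
    std::vector<std::vector<int>> dist = {
        {0,   5,   INF, 10 },
        {INF, 0,   3,   INF},
        {INF, INF, 0,   1  },
        {INF, INF, INF, 0  }};
    const int n = (int)dist.size();

    for (int k = 0; k < n; ++k)          // intermediate vertices allowed so far
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                if (dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];

    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j)
            std::cout << (dist[i][j] >= INF ? -1 : dist[i][j]) << " ";
        std::cout << "\n";
    }
    return 0;
}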
Sunday, August 17, 2014
Find MST (Minimum Spanning Tree) using Prim’s Algorithm
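One way to sketch Prim's algorithm is with a std::priority_queue and lazy deletion of stale entries; the undirected weighted graph below is made-up toy data.

// Prim's algorithm with a std::priority_queue (lazy deletion): grow the MST
// from vertex 0, always adding the cheapest edge reaching a new vertex.
// O(E log E) with a binary heap.
#include <iostream>
#include <queue>
#include <vector>

int main() {
    // Undirected weighted graph as adjacency lists: adj[u] = {(v, weight)}.
    int n = 5;
    std::vector<std::vector<std::pair<int, int>>> adj(n);
    auto addEdge = [&](int u, int v, int w) {
        adj[u].push_back({v, w});
        adj[v].push_back({u, w});
    };
    addEdge(0, 1, 2); addEdge(0, 3, 6); addEdge(1, 2, 3);
    addEdge(1, 3, 8); addEdge(1, 4, 5); addEdge(2, 4, 7); addEdge(3, 4, 9);

    using Edge = std::pair<int, int>;            // (weight, vertex)
    std::priority_queue<Edge, std::vector<Edge>, std::greater<Edge>> pq;
    std::vector<bool> inMST(n, false);
    pq.push({0, 0});
    long long total = 0;

    while (!pq.empty()) {
        auto [w, u] = pq.top(); pq.pop();
        if (inMST[u]) continue;                   // stale entry, skip it
        inMST[u] = true;
        total += w;
        for (auto [v, wv] : adj[u])
            if (!inMST[v]) pq.push({wv, v});
    }
    std::cout << "MST total weight = " << total << "\n";  // 16 for this toy graph
    return 0;
}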
Saturday, August 16, 2014
Friday, August 15, 2014
Adaboost
Adaboost is one of my favorite Machine Learning algorithms. The idea is quite intriguing: you start from a set of weak classifiers and learn how to combine them linearly so that the error is reduced. The result is a strong classifier built by boosting the weak classifiers.
Mathematically, given labels y_i in {+1, -1} and a training dataset x_i, Adaboost minimizes the exponential loss function sum_i exp(-y_i * f(x_i)). The function f = sum_t (alpha_t * h_t) is a linear combination of the weak classifiers h_t with weights alpha_t. For those who love Optimization Theory, Adaboost is a classical application of Gradient Descent.
The algorithm is quite simple: it was included among the top 10 data mining algorithms in 2007, and its inventors received the Gödel Prize in 2003. After Adaboost, Boosting became quite popular in the data mining community, with applications in Ranking and Clustering.
Here you have the code for AdaBoost in C++ and Boost.
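To make the idea concrete, here is a compact sketch that uses one-dimensional threshold stumps as the weak classifiers h_t, written in plain C++ without Boost; the toy data, the candidate thresholds, and the number of boosting rounds are illustrative assumptions, not the full C++/Boost implementation the post refers to.

// Compact AdaBoost sketch on 1-D data with threshold "decision stumps" as
// weak learners. Each round picks the stump with the lowest weighted error,
// gives it weight alpha_t = 0.5 * ln((1 - err) / err), and re-weights the
// samples; the strong classifier is sign(sum_t alpha_t * h_t(x)).
#include <cmath>
#include <iostream>
#include <vector>

struct Stump { double threshold; int polarity; double alpha; };

int stumpPredict(const Stump& s, double x) {
    return (x < s.threshold ? 1 : -1) * s.polarity;
}

int main() {
    // Toy training set: x values and labels in {+1, -1}.
    std::vector<double> x = {0.5, 1.5, 2.5, 3.5, 4.5, 5.5};
    std::vector<int>    y = {+1,  +1,  -1,  +1,  -1,  -1};
    const size_t n = x.size();
    std::vector<double> w(n, 1.0 / n);               // sample weights

    std::vector<Stump> model;
    for (int round = 0; round < 3; ++round) {
        Stump best{}; double bestErr = 1e9;
        for (double t = 0.0; t <= 6.0; t += 1.0)     // candidate thresholds
            for (int pol : {+1, -1}) {
                Stump s{t, pol, 0.0};
                double err = 0.0;
                for (size_t i = 0; i < n; ++i)
                    if (stumpPredict(s, x[i]) != y[i]) err += w[i];
                if (err < bestErr) { bestErr = err; best = s; }
            }
        bestErr = std::max(bestErr, 1e-10);          // avoid division by zero
        best.alpha = 0.5 * std::log((1.0 - bestErr) / bestErr);
        model.push_back(best);
        // Re-weight: misclassified samples get heavier, then normalize.
        double sum = 0.0;
        for (size_t i = 0; i < n; ++i) {
            w[i] *= std::exp(-best.alpha * y[i] * stumpPredict(best, x[i]));
            sum += w[i];
        }
        for (auto& wi : w) wi /= sum;
    }

    // Strong classifier: weighted vote of the stumps.
    for (size_t i = 0; i < n; ++i) {
        double f = 0.0;
        for (const auto& s : model) f += s.alpha * stumpPredict(s, x[i]);
        std::cout << "x=" << x[i] << "  predicted=" << (f >= 0 ? +1 : -1)
                  << "  true=" << y[i] << "\n";
    }
    return 0;
}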
K-Means
K-means is a classical clustering algorithm.
Here you have a C++ code for K-means clustering.
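For reference, below is a minimal self-contained sketch of the idea: assign each point to its nearest centroid, recompute every centroid as the mean of its cluster, and repeat. The toy 2-D points, the choice of k = 2, the naive initialization, and the fixed number of iterations are all illustrative assumptions.

// Minimal K-means sketch on 2-D points: assignment step, update step,
// repeated for a fixed number of iterations.
#include <iostream>
#include <vector>

struct Point { double x, y; };

double dist2(const Point& a, const Point& b) {
    return (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y);
}

int main() {
    std::vector<Point> pts = {{1, 1}, {1.5, 2}, {3, 4}, {5, 7}, {3.5, 5}, {4.5, 5}, {3.5, 4.5}};
    const int k = 2;
    std::vector<Point> centroids = {pts[0], pts[3]};    // naive initialization
    std::vector<int> assign(pts.size(), 0);

    for (int iter = 0; iter < 10; ++iter) {
        // Assignment step: nearest centroid for every point.
        for (size_t i = 0; i < pts.size(); ++i) {
            int best = 0;
            for (int c = 1; c < k; ++c)
                if (dist2(pts[i], centroids[c]) < dist2(pts[i], centroids[best])) best = c;
            assign[i] = best;
        }
        // Update step: centroid = mean of its assigned points.
        std::vector<Point> sum(k, Point{0, 0});
        std::vector<int> count(k, 0);
        for (size_t i = 0; i < pts.size(); ++i) {
            sum[assign[i]].x += pts[i].x;
            sum[assign[i]].y += pts[i].y;
            ++count[assign[i]];
        }
        for (int c = 0; c < k; ++c)
            if (count[c] > 0)
                centroids[c] = {sum[c].x / count[c], sum[c].y / count[c]};
    }

    for (size_t i = 0; i < pts.size(); ++i)
        std::cout << "(" << pts[i].x << "," << pts[i].y << ") -> cluster " << assign[i] << "\n";
    return 0;
}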
(Edit: 12/05/2013)
See also my more recent posting A new lighter implementation of K-means
Thursday, August 14, 2014
Wednesday, August 13, 2014
Find MST (Minimum Spanning Tree) using Kruskal’s Algorithm
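A possible sketch of Kruskal's algorithm: sort the edges by weight and add an edge whenever it connects two different components, tracked with a small union-find structure; the toy graph is made up for illustration.

// Kruskal's algorithm with a union-find (disjoint-set) structure using
// path compression.
#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

struct DSU {
    std::vector<int> parent;
    explicit DSU(int n) : parent(n) { std::iota(parent.begin(), parent.end(), 0); }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
    bool unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return false;
        parent[a] = b;
        return true;
    }
};

struct Edge { int u, v, w; };

int main() {
    int n = 5;
    std::vector<Edge> edges = {
        {0, 1, 2}, {0, 3, 6}, {1, 2, 3}, {1, 3, 8}, {1, 4, 5}, {2, 4, 7}, {3, 4, 9}};

    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.w < b.w; });

    DSU dsu(n);
    long long total = 0;
    for (const Edge& e : edges)
        if (dsu.unite(e.u, e.v)) {                 // edge joins two components
            total += e.w;
            std::cout << "take edge " << e.u << "-" << e.v << " (w=" << e.w << ")\n";
        }
    std::cout << "MST total weight = " << total << "\n";   // 16 for this toy graph
    return 0;
}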
Tuesday, August 12, 2014
Monday, August 11, 2014
Find the strongly connected components in a graph
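One way to sketch this is Kosaraju's two-pass algorithm: a DFS on the graph to get a finish order, then a DFS on the transposed graph in reverse finish order, where every tree of the second pass is one strongly connected component. The toy graph is made up for illustration.

// Strongly connected components via Kosaraju's two-pass algorithm.
#include <algorithm>
#include <iostream>
#include <vector>

void dfs(int u, const std::vector<std::vector<int>>& g,
         std::vector<bool>& seen, std::vector<int>& out) {
    seen[u] = true;
    for (int v : g[u]) if (!seen[v]) dfs(v, g, seen, out);
    out.push_back(u);                                // post-order: u finishes here
}

int main() {
    int n = 5;
    std::vector<std::vector<int>> g(n), gt(n);       // graph and its transpose
    auto addEdge = [&](int u, int v) { g[u].push_back(v); gt[v].push_back(u); };
    addEdge(0, 1); addEdge(1, 2); addEdge(2, 0);     // SCC {0,1,2}
    addEdge(2, 3); addEdge(3, 4);                    // SCCs {3} and {4}

    std::vector<bool> seen(n, false);
    std::vector<int> order;
    for (int u = 0; u < n; ++u) if (!seen[u]) dfs(u, g, seen, order);

    std::fill(seen.begin(), seen.end(), false);
    for (int i = n - 1; i >= 0; --i) {               // reverse finish order
        int u = order[i];
        if (seen[u]) continue;
        std::vector<int> component;
        dfs(u, gt, seen, component);                 // one SCC of the original graph
        std::cout << "SCC:";
        for (int v : component) std::cout << " " << v;
        std::cout << "\n";
    }
    return 0;
}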
Sunday, August 10, 2014
Find a Hamiltonian cycle
The problem is NP-complete, so provide an exponential backtracking solution.
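A backtracking sketch: extend a path one vertex at a time and, once all vertices are used, check whether there is an edge back to the start; the adjacency matrix below is made-up toy data.

// Backtracking search for a Hamiltonian cycle starting from vertex 0.
// Exponential in the worst case, as expected for an NP-complete problem.
#include <iostream>
#include <vector>

bool hamiltonian(const std::vector<std::vector<bool>>& adj,
                 std::vector<int>& path, std::vector<bool>& used) {
    const int n = (int)adj.size();
    if ((int)path.size() == n)                        // all vertices used: close the cycle?
        return adj[path.back()][path.front()];
    for (int v = 0; v < n; ++v) {
        if (used[v] || !adj[path.back()][v]) continue;
        used[v] = true;
        path.push_back(v);
        if (hamiltonian(adj, path, used)) return true;
        path.pop_back();                              // backtrack
        used[v] = false;
    }
    return false;
}

int main() {
    // Toy undirected graph as an adjacency matrix.
    std::vector<std::vector<bool>> adj = {
        {0, 1, 0, 1, 0},
        {1, 0, 1, 1, 1},
        {0, 1, 0, 0, 1},
        {1, 1, 0, 0, 1},
        {0, 1, 1, 1, 0}};
    std::vector<int> path = {0};
    std::vector<bool> used(adj.size(), false);
    used[0] = true;

    if (hamiltonian(adj, path, used)) {
        std::cout << "Hamiltonian cycle:";
        for (int v : path) std::cout << " " << v;
        std::cout << " 0\n";
    } else {
        std::cout << "no Hamiltonian cycle\n";
    }
    return 0;
}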
Saturday, August 9, 2014
Given a connected graph, compute the minimum spanning tree (MST)
Use Fibonacci heaps from the Boost library.
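One possible sketch is Prim's algorithm with the frontier kept in a boost::heap::fibonacci_heap (part of Boost.Heap, available since Boost 1.49 if I recall correctly), where update() on a stored handle plays the role of decrease-key; the connected toy graph is made up for illustration.

// Prim's algorithm with a Boost Fibonacci heap, so the decrease-key step is
// O(1) amortized instead of O(log n).
#include <boost/heap/fibonacci_heap.hpp>
#include <iostream>
#include <limits>
#include <vector>

struct Entry {
    int vertex;
    int key;                                    // best edge weight seen so far
    bool operator<(const Entry& other) const {  // Boost heaps are max-heaps by
        return key > other.key;                 // default, so invert to get a min-heap
    }
};

int main() {
    const int INF = std::numeric_limits<int>::max();
    int n = 5;
    std::vector<std::vector<std::pair<int, int>>> adj(n);   // (neighbor, weight)
    auto addEdge = [&](int u, int v, int w) {
        adj[u].push_back({v, w}); adj[v].push_back({u, w});
    };
    addEdge(0, 1, 2); addEdge(0, 3, 6); addEdge(1, 2, 3);
    addEdge(1, 3, 8); addEdge(1, 4, 5); addEdge(2, 4, 7); addEdge(3, 4, 9);

    using Heap = boost::heap::fibonacci_heap<Entry>;
    Heap heap;
    std::vector<Heap::handle_type> handle;
    std::vector<int> key(n, INF);
    std::vector<bool> done(n, false);

    key[0] = 0;
    for (int v = 0; v < n; ++v) handle.push_back(heap.push({v, key[v]}));

    long long total = 0;
    while (!heap.empty()) {
        Entry top = heap.top(); heap.pop();
        done[top.vertex] = true;
        total += top.key;
        for (auto [v, w] : adj[top.vertex])
            if (!done[v] && w < key[v]) {
                key[v] = w;
                heap.update(handle[v], {v, w});   // decrease-key via update()
            }
    }
    std::cout << "MST total weight = " << total << "\n";    // 16 for this toy graph
    return 0;
}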
Friday, August 8, 2014
Thursday, August 7, 2014
Given a directed acyclic graph, implement a topological sort
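A possible sketch using Kahn's algorithm: repeatedly output a vertex with in-degree zero and remove its outgoing edges; if not all vertices are output, the graph has a cycle. The toy DAG is made up for illustration.

// Topological sort of a DAG with Kahn's algorithm.
#include <iostream>
#include <queue>
#include <vector>

int main() {
    int n = 6;
    std::vector<std::vector<int>> adj(n);
    std::vector<int> indeg(n, 0);
    auto addEdge = [&](int u, int v) { adj[u].push_back(v); ++indeg[v]; };
    addEdge(5, 2); addEdge(5, 0); addEdge(4, 0);
    addEdge(4, 1); addEdge(2, 3); addEdge(3, 1);

    std::queue<int> ready;
    for (int v = 0; v < n; ++v) if (indeg[v] == 0) ready.push(v);

    std::vector<int> order;
    while (!ready.empty()) {
        int u = ready.front(); ready.pop();
        order.push_back(u);
        for (int v : adj[u])
            if (--indeg[v] == 0) ready.push(v);
    }

    if ((int)order.size() != n) {
        std::cout << "graph has a cycle, no topological order\n";
    } else {
        for (int v : order) std::cout << v << " ";
        std::cout << "\n";
    }
    return 0;
}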
Wednesday, August 6, 2014
Tuesday, August 5, 2014
Monday, August 4, 2014
Implement a DFS visit
Try to be independent of the underlying graph implementation
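One way to keep the traversal independent of the representation is to pass the neighbor enumeration as a callable; the adjacency-list graph in main() is just one possible backing store, made up for illustration.

// DFS sketch decoupled from the graph representation: the caller supplies a
// callable that returns the neighbors of a vertex, so the same code works
// with adjacency lists, matrices, or implicit graphs.
#include <functional>
#include <iostream>
#include <vector>

template <typename Neighbors, typename Visit>
void dfs(int start, int n, Neighbors neighbors, Visit visit) {
    std::vector<bool> seen(n, false);
    std::function<void(int)> go = [&](int u) {
        seen[u] = true;
        visit(u);                                 // pre-order visit
        for (int v : neighbors(u))
            if (!seen[v]) go(v);
    };
    go(start);
}

int main() {
    // One possible representation: adjacency lists.
    std::vector<std::vector<int>> adj = {{1, 2}, {3}, {3}, {4}, {}};
    dfs(0, adj.size(), [&](int u) { return adj[u]; },
        [](int u) { std::cout << "visit " << u << "\n"; });
    return 0;
}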
Sunday, August 3, 2014
Implement a BFS visit in C++
Try to be independent of the underlying graph implementation
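The same representation-independent style works for BFS; the toy adjacency-list graph is made up for illustration.

// BFS sketch: vertices are visited in order of distance from the source,
// with the neighbor enumeration again supplied as a callable.
#include <iostream>
#include <queue>
#include <vector>

template <typename Neighbors, typename Visit>
void bfs(int start, int n, Neighbors neighbors, Visit visit) {
    std::vector<bool> seen(n, false);
    std::queue<int> q;
    seen[start] = true;
    q.push(start);
    while (!q.empty()) {
        int u = q.front(); q.pop();
        visit(u);
        for (int v : neighbors(u))
            if (!seen[v]) { seen[v] = true; q.push(v); }
    }
}

int main() {
    std::vector<std::vector<int>> adj = {{1, 2}, {3}, {3}, {4}, {}};
    bfs(0, adj.size(), [&](int u) { return adj[u]; },
        [](int u) { std::cout << "visit " << u << "\n"; });
    return 0;
}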
Saturday, August 2, 2014
Implement a graph in C++
Use an adjacency matrix. If possible, use templates for storing node attributes and link attributes. Is this the best implementation?
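A possible sketch: a class templated on the node and edge attribute types, with the matrix holding std::optional edge attributes. Whether it is the best implementation depends on density: O(1) edge lookup but O(V^2) memory, so adjacency lists are usually preferable for sparse graphs. The class name and the toy usage in main() are illustrative assumptions.

// Adjacency-matrix graph templated on node and edge attribute types.
#include <iostream>
#include <optional>
#include <string>
#include <vector>

template <typename NodeAttr, typename EdgeAttr>
class MatrixGraph {
public:
    explicit MatrixGraph(int n)
        : nodes_(n), edges_(n, std::vector<std::optional<EdgeAttr>>(n)) {}

    void setNode(int u, const NodeAttr& attr) { nodes_[u] = attr; }
    const NodeAttr& node(int u) const { return nodes_[u]; }

    void addEdge(int u, int v, const EdgeAttr& attr) { edges_[u][v] = attr; }
    bool hasEdge(int u, int v) const { return edges_[u][v].has_value(); }
    const EdgeAttr& edge(int u, int v) const { return *edges_[u][v]; }

    int size() const { return (int)nodes_.size(); }

private:
    std::vector<NodeAttr> nodes_;
    std::vector<std::vector<std::optional<EdgeAttr>>> edges_;   // [u][v]
};

int main() {
    MatrixGraph<std::string, double> g(3);       // node label, edge weight
    g.setNode(0, "a"); g.setNode(1, "b"); g.setNode(2, "c");
    g.addEdge(0, 1, 1.5); g.addEdge(1, 2, 2.5);

    for (int u = 0; u < g.size(); ++u)
        for (int v = 0; v < g.size(); ++v)
            if (g.hasEdge(u, v))
                std::cout << g.node(u) << " -> " << g.node(v)
                          << " (w=" << g.edge(u, v) << ")\n";
    return 0;
}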
Friday, August 1, 2014
Implement a Graph in C++
Directed graph with adjacency lists
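A possible sketch templated on the edge attribute type, with each vertex storing its outgoing (target, attribute) pairs, which keeps memory proportional to the number of edges; the class name and the toy edges are illustrative assumptions.

// Directed graph with adjacency lists, templated on the edge attribute type.
#include <iostream>
#include <vector>

template <typename EdgeAttr>
class ListDigraph {
public:
    explicit ListDigraph(int n) : adj_(n) {}

    void addEdge(int u, int v, const EdgeAttr& attr) { adj_[u].push_back({v, attr}); }
    const std::vector<std::pair<int, EdgeAttr>>& out(int u) const { return adj_[u]; }
    int size() const { return (int)adj_.size(); }

private:
    std::vector<std::vector<std::pair<int, EdgeAttr>>> adj_;   // adj_[u] = outgoing edges
};

int main() {
    ListDigraph<double> g(4);
    g.addEdge(0, 1, 0.5); g.addEdge(0, 2, 1.0); g.addEdge(2, 3, 2.0);

    for (int u = 0; u < g.size(); ++u)
        for (const auto& [v, w] : g.out(u))
            std::cout << u << " -> " << v << " (attr=" << w << ")\n";
    return 0;
}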