For each image, a set of representative features is extracted, such as color histogram, color layout, scalable color, CEDD, edge histogram, and Tamura. Strangely enough, the authors do not consider SIFT and wavelet signatures, which are quite popular these days.
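To make the simplest of these features concrete, here is a minimal sketch of a quantized color histogram. It is only an illustration: the function name `color_histogram`, the `bins_per_channel` parameter, and the representation of an image as a plain list of `(r, g, b)` tuples are my own assumptions, not the extraction pipeline used in the paper.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram for (r, g, b) pixels with values in 0..255.

    Hypothetical helper for illustration; not the paper's feature extractor.
    """
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
        count += 1
    if count == 0:
        return hist
    # Normalize so that images of different sizes are comparable.
    return [h / count for h in hist]


if __name__ == "__main__":
    toy_image = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 255, 0)]
    print(color_histogram(toy_image, bins_per_channel=2))
```

Two images can then be compared by any vector distance between their histograms, which is the kind of input the clustering algorithms below need.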
The authors evaluated three different algorithms:
- The folding algorithm respects the original ranking of the search results as returned by the textual retrieval model. Images higher in the ranking have a larger probability of being selected as a cluster representative. The representatives are selected in one linear pass, and the clusters are then formed around them.
- The maxmin approach also performs representative selection prior to cluster formation, but discards the original ranking and finds representatives that are visually different from each other.
- Reciprocal election lets every image cast votes for the images that best represent it. Strong voters are then assigned to their corresponding representatives and taken off the list of candidates. This process is repeated as long as there exist
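
As an illustration of the maxmin idea (pick representatives that are far from each other, then form clusters around them), here is a generic farthest-first sketch. It is not the authors' implementation: the Euclidean distance, the fixed number of representatives `k`, the choice of the first representative, and all function names are assumptions made for the example, and the paper's own distance measure and stopping rule may well differ.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def maxmin_representatives(features, k, first=0):
    """Pick up to k indices so that each new pick maximizes its minimum
    distance to the representatives chosen so far (farthest-first).

    `k` and `first` are assumptions of this sketch, not taken from the paper.
    """
    if not features:
        return []
    reps = [first]
    # min_dist[i] = distance from image i to its closest representative.
    min_dist = [euclidean(f, features[first]) for f in features]
    while len(reps) < min(k, len(features)):
        nxt = max(range(len(features)), key=lambda i: min_dist[i])
        if min_dist[nxt] == 0.0:
            break  # every remaining image coincides with a representative
        reps.append(nxt)
        for i, f in enumerate(features):
            min_dist[i] = min(min_dist[i], euclidean(f, features[nxt]))
    return reps


def assign_to_clusters(features, reps):
    """Label each image with the index of its nearest representative."""
    return [min(reps, key=lambda r: euclidean(f, features[r])) for f in features]


if __name__ == "__main__":
    feats = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 8)]
    reps = maxmin_representatives(feats, k=3)
    print(reps, assign_to_clusters(feats, reps))
```

Note that the original ranking plays no role here, which matches the description above: only visual distances decide which images become representatives.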
The evaluation measures are the Fowlkes-Mallows (FM) index and the variation of information criterion. Folding is the best algorithm under the FM index, and reciprocal election is the best one according to the variation of information criterion.
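
For readers unfamiliar with these two measures, here is a small sketch of their standard textbook definitions, computed from two label lists (for example, a clustering and a ground truth). The function names are my own, and the paper's exact evaluation protocol may differ. The FM index is the geometric mean of pairwise precision and recall, so higher is better. The variation of information is H(A) + H(B) - 2 I(A, B), so lower is better, with 0 for identical partitions.

```python
import math
from collections import Counter


def _pairs(n):
    """Number of unordered pairs among n items."""
    return n * (n - 1) // 2


def fowlkes_mallows(labels_a, labels_b):
    """FM index: co-clustered pairs in both partitions, normalized by the
    geometric mean of the co-clustered pair counts of each partition."""
    both = sum(_pairs(c) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(_pairs(c) for c in Counter(labels_a).values())
    pairs_b = sum(_pairs(c) for c in Counter(labels_b).values())
    if pairs_a == 0 or pairs_b == 0:
        return 0.0  # convention chosen for this sketch
    return both / math.sqrt(pairs_a * pairs_b)


def variation_of_information(labels_a, labels_b):
    """VI = H(A|B) + H(B|A), in nats."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    vi = 0.0
    for (a, b), c in Counter(zip(labels_a, labels_b)).items():
        p_ab = c / n
        vi -= p_ab * (math.log(p_ab / (count_a[a] / n))
                      + math.log(p_ab / (count_b[b] / n)))
    return vi


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    print(fowlkes_mallows(truth, [0, 0, 1, 1]),
          variation_of_information(truth, [0, 1, 0, 1]))
```

Since the two measures reward different things (pairwise agreement versus information shared between partitions), it is not surprising that they can rank the algorithms differently.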
I recommend reading the paper if you are interested in image search and clustering. Just one observation for the authors: since this work targets use in production, I would have appreciated a comparison of the time needed to extract the features and to cluster the results with each of the three algorithms.