Claire Kenyon (Ecole Polytechnique, Paris, France):
Metric Clustering
Given n data items in a general metric space, how can we efficiently
partition them into k clusters of "similar" items? There are many
models for this ubiquitous problem, which arises in application areas
such as data mining, image recognition, medical imaging,
web analysis, etc. One measure of the quality of a k-clustering is
the sum of all pairwise distances inside the clusters, which
must be minimized. We discuss techniques and algorithms,
first for the complementary problem, which can be seen as a metric analog
of Max-Cut in graphs, then for 2-clustering, and
finally sketch extensions to variants with other objective
functions or with cardinality constraints.
The algorithms are based on random sampling.