Clustering under Perturbation Stability in Near-Linear Time

Published 30 Sep 2020 in cs.DS and cs.CG | (2009.14358v1)

Abstract: We consider the problem of center-based clustering in low-dimensional Euclidean spaces under the perturbation stability assumption. An instance is $\alpha$-stable if the underlying optimal clustering continues to remain optimal even when all pairwise distances are arbitrarily perturbed by a factor of at most $\alpha$. Our main contribution is in presenting efficient exact algorithms for $\alpha$-stable clustering instances whose running times depend near-linearly on the size of the data set when $\alpha \ge 2 + \sqrt{3}$. For $k$-center and $k$-means problems, our algorithms also achieve polynomial dependence on the number of clusters, $k$, when $\alpha \geq 2 + \sqrt{3} + \epsilon$ for any constant $\epsilon > 0$ in any fixed dimension. For $k$-median, our algorithms have polynomial dependence on $k$ for $\alpha > 5$ in any fixed dimension; and for $\alpha \geq 2 + \sqrt{3}$ in two dimensions. Our algorithms are simple, and only require applying techniques such as local search or dynamic programming to a suitably modified metric space, combined with careful choice of data structures.