Achieving anonymity via weak lower bound constraints for k-median and k-means (2009.03078v3)

Published 7 Sep 2020 in cs.DS

Abstract: We study $k$-clustering problems with lower bounds, including $k$-median and $k$-means clustering with lower bounds. In addition to the point set $P$ and the number of centers $k$, a $k$-clustering problem with (uniform) lower bounds gets a number $B$. The solution space is restricted to clusterings where every cluster has at least $B$ points. We demonstrate how to approximate $k$-median with lower bounds via a reduction to facility location with lower bounds, for which $O(1)$-approximation algorithms are known. Then we propose a new constrained clustering problem with lower bounds where we allow points to be assigned multiple times (to different centers). This means that for every point, the clustering specifies a set of centers to which it is assigned. We call this clustering with weak lower bounds. We give a $(6.5+\epsilon)$-approximation for $k$-median clustering with weak lower bounds and an $O(1)$-approximation for $k$-means with weak lower bounds. We conclude by showing that at a constant increase in the approximation factor, we can restrict the number of assignments of every point to $2$ (or, if we allow fractional assignments, to $1+\epsilon$). This also leads to the first bicritera approximation algorithm for $k$-means with (standard) lower bounds where bicriteria is interpreted in the sense that the lower bounds are violated by a constant factor. All algorithms in this paper run in time that is polynomial in $n$ and $k$ (and $d$ for the Euclidean variants considered).

Citations (8)

View on Semantic Scholar