Coresets for Constrained Clustering: General Assignment Constraints and Improved Size Bounds (2301.08460v6)
Abstract: Designing small-sized \emph{coresets}, which approximately preserve the costs of the solutions for large datasets, has been an important research direction for the past decade. We consider coreset construction for a variety of general constrained clustering problems. We introduce a general class of assignment constraints, including capacity constraints on cluster centers, and assignment structure constraints for data points (modeled by a convex body $\mathcal{B}$). We give coresets for clustering problems with such general assignment constraints that significantly generalize and improve known results. Notable implications include the first $\varepsilon$-coreset for capacitated and fair $k$-Median with $m$ outliers in Euclidean spaces whose size is $\tilde{O}(m + k^2 \varepsilon^{-4})$, generalizing and improving upon the prior bounds in Braverman et al., FOCS '22; Huang et al., ICLR '23, and the first $\varepsilon$-coreset of size $\mathrm{poly}(k \varepsilon^{-1})$ for fault-tolerant clustering for various types of metric spaces.