Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 60 tok/s
Gemini 2.5 Pro 51 tok/s Pro
GPT-5 Medium 18 tok/s Pro
GPT-5 High 14 tok/s Pro
GPT-4o 77 tok/s Pro
Kimi K2 159 tok/s Pro
GPT OSS 120B 456 tok/s Pro
Claude Sonnet 4 38 tok/s Pro
2000 character limit reached

Towards Optimal Lower Bounds for k-median and k-means Coresets (2202.12793v1)

Published 25 Feb 2022 in cs.DS, cs.CG, and cs.LG

Abstract: Given a set of points in a metric space, the $(k,z)$-clustering problem consists of finding a set of $k$ points called centers, such that the sum of distances raised to the power of $z$ of every data point to its closest center is minimized. Special cases include the famous k-median problem ($z = 1$) and k-means problem ($z = 2$). The $k$-median and $k$-means problems are at the heart of modern data analysis and massive data applications have given raise to the notion of coreset: a small (weighted) subset of the input point set preserving the cost of any solution to the problem up to a multiplicative $(1 \pm \varepsilon)$ factor, hence reducing from large to small scale the input to the problem. In this paper, we present improved lower bounds for coresets in various metric spaces. In finite metrics consisting of $n$ points and doubling metrics with doubling constant $D$, we show that any coreset for $(k,z)$ clustering must consist of at least $\Omega(k \varepsilon{-2} \log n)$ and $\Omega(k \varepsilon{-2} D)$ points, respectively. Both bounds match previous upper bounds up to polylog factors. In Euclidean spaces, we show that any coreset for $(k,z)$ clustering must consists of at least $\Omega(k\varepsilon{-2})$ points. We complement these lower bounds with a coreset construction consisting of at most $\tilde{O}(k\varepsilon{-2}\cdot \min(\varepsilon{-z},k))$ points.

Citations (42)

Summary

We haven't generated a summary for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Lightbulb On Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.