
Improved Coresets for Euclidean $k$-Means (2211.08184v2)

Published 15 Nov 2022 in cs.CG and cs.LG

Abstract: Given a set of $n$ points in $d$ dimensions, the Euclidean $k$-means problem (resp. the Euclidean $k$-median problem) consists of finding $k$ centers such that the sum of squared distances (resp. sum of distances) from every point to its closest center is minimized. The arguably most popular way of dealing with this problem in the big data setting is to first compress the data by computing a weighted subset known as a coreset and then run any algorithm on this subset. The guarantee of the coreset is that for any candidate solution, the ratio between coreset cost and the cost of the original instance lies within a $(1\pm \varepsilon)$ factor. The current state of the art coreset size is $\tilde O(\min(k^{2} \cdot \varepsilon^{-2},k\cdot \varepsilon^{-4}))$ for Euclidean $k$-means and $\tilde O(\min(k^{2} \cdot \varepsilon^{-2},k\cdot \varepsilon^{-3}))$ for Euclidean $k$-median. The best known lower bound for both problems is $\Omega(k \varepsilon^{-2})$. In this paper, we improve the upper bounds to $\tilde O(\min(k^{3/2} \cdot \varepsilon^{-2},k\cdot \varepsilon^{-4}))$ for $k$-means and $\tilde O(\min(k^{4/3} \cdot \varepsilon^{-2},k\cdot \varepsilon^{-3}))$ for $k$-median. In particular, ours is the first provable bound that breaks through the $k^2$ barrier while retaining an optimal dependency on $\varepsilon$.
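
To make the coreset guarantee in the abstract concrete, the following is a minimal sketch in pure Python. It computes the $k$-means cost of a weighted point set and compares the cost on a small weighted sample against the cost on the full data for one candidate solution. The sampling step is plain uniform sampling, used here only as an assumed stand-in: it is not the construction from the paper and carries no worst-case $(1\pm\varepsilon)$ guarantee. All function names, the synthetic data, and the sample size are illustrative assumptions.

```python
import random


def sq_dist(p, c):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def kmeans_cost(points, centers, weights=None):
    """Weighted sum of squared distances from each point to its closest center."""
    if weights is None:
        weights = [1.0] * len(points)
    return sum(
        w * min(sq_dist(p, c) for c in centers)
        for p, w in zip(points, weights)
    )


def uniform_sample_coreset(points, m, rng):
    """Hypothetical stand-in for a coreset: m points drawn uniformly with
    replacement, each weighted n/m so the total weight equals n.
    This is NOT the paper's construction and has no worst-case guarantee."""
    n = len(points)
    sample = [rng.choice(points) for _ in range(m)]
    return sample, [n / m] * m


def cost_ratio(points, sample, weights, centers):
    """Ratio of weighted-sample cost to full-data cost for one candidate solution.
    A coreset requires this to lie in [1 - eps, 1 + eps] for EVERY set of k centers."""
    return kmeans_cost(sample, centers, weights) / kmeans_cost(points, centers)


# Synthetic data (an assumption for illustration): three Gaussian blobs in 2D.
rng = random.Random(0)
blob_means = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (rng.gauss(mx, 1.0), rng.gauss(my, 1.0))
    for mx, my in blob_means
    for _ in range(500)
]

sample, weights = uniform_sample_coreset(points, 200, rng)
ratio = cost_ratio(points, sample, weights, blob_means)
print(f"cost ratio for one candidate solution: {ratio:.3f}")
```

Checking a single set of centers, as above, only probes the guarantee; a true coreset must keep the ratio within $(1\pm\varepsilon)$ for all candidate solutions simultaneously, which is what makes small coreset sizes hard to prove.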
