Papers
Topics
Authors
Recent
2000 character limit reached

Coresets for $k$-Means and $k$-Median Clustering and their Applications (1810.12826v1)

Published 30 Oct 2018 in cs.CG

Abstract: $\renewcommand{\Re}{{\rm I!\hspace{-0.025em} R}} \newcommand{\eps}{{\varepsilon}} \newcommand{\Coreset}{{\mathcal{S}}} $ In this paper, we show the existence of small coresets for the problems of computing $k$-median and $k$-means clustering for points in low dimension. In other words, we show that given a point set $P$ in $\Red$, one can compute a weighted set $\Coreset \subseteq P$, of size $O(k \eps{-d} \log{n})$, such that one can compute the $k$-median/means clustering on $\Coreset$ instead of on $P$, and get an $(1+\eps)$-approximation. As a result, we improve the fastest known algorithms for $(1+\eps)$-approximate $k$-means and $k$-median clustering. Our algorithms have linear running time for a fixed $k$ and $\eps$. In addition, we can maintain the $(1+\eps)$-approximate $k$-median or $k$-means clustering of a stream when points are being only inserted, using polylogarithmic space and update time.

Citations (124)

Summary

We haven't generated a summary for this paper yet.

Slide Deck Streamline Icon: https://streamlinehq.com

Whiteboard

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.