
Dual-tree $k$-means with bounded iteration runtime (1601.03754v1)

Published 14 Jan 2016 in cs.DS and cs.LG

Abstract: $k$-means is a widely used clustering algorithm, but for $k$ clusters and a dataset size of $N$, each iteration of Lloyd's algorithm costs $O(kN)$ time. Although there are existing techniques to accelerate single Lloyd iterations, none of these are tailored to the case of large $k$, which is increasingly common as dataset sizes grow. We propose a dual-tree algorithm that gives the exact same results as standard $k$-means; when using cover trees, we use adaptive analysis techniques to, under some assumptions, bound the single-iteration runtime of the algorithm as $O(N + k \log k)$. To our knowledge these are the first sub-$O(kN)$ bounds for exact Lloyd iterations. We then show that this theoretically favorable algorithm performs competitively in practice, especially for large $N$ and $k$ in low dimensions. Further, the algorithm is tree-independent, so any type of tree may be used.
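To make the $O(kN)$ baseline concrete, the following is a minimal sketch of one naive Lloyd iteration in plain Python. It is not the paper's dual-tree algorithm; it only illustrates the standard iteration that the dual-tree method is claimed to reproduce exactly. The names `squared_distance` and `lloyd_iteration`, and the choice to leave an empty cluster's centroid unchanged, are assumptions made for this illustration.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd_iteration(
    points: Sequence[Point], centroids: Sequence[Point]
) -> Tuple[List[int], List[List[float]]]:
    """Run one naive Lloyd iteration (hypothetical helper, for illustration).

    Every point is compared against every centroid, so the assignment
    step performs k * N distance computations: the O(kN) cost that the
    abstract refers to.
    """
    k = len(centroids)
    dim = len(centroids[0])
    assignments: List[int] = []
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k

    # Assignment step: nearest centroid for each point.
    for p in points:
        best = min(range(k), key=lambda j: squared_distance(p, centroids[j]))
        assignments.append(best)
        counts[best] += 1
        for d in range(dim):
            sums[best][d] += p[d]

    # Update step: each centroid moves to the mean of its assigned points.
    # Assumption: an empty cluster keeps its previous centroid.
    new_centroids: List[List[float]] = []
    for j in range(k):
        if counts[j] == 0:
            new_centroids.append(list(centroids[j]))
        else:
            new_centroids.append([s / counts[j] for s in sums[j]])

    return assignments, new_centroids


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
    init = [(1.0, 1.0), (9.0, 1.0)]
    print(lloyd_iteration(pts, init))
```

The nested loop over points and centroids is the part that tree-based accelerations aim to avoid: by pruning centroids that cannot be the nearest one for a whole group of points, they skip most of the $kN$ distance computations while returning the same assignments.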

Citations (1)

Authors (1)