
Do you know what q-means? (2308.09701v2)

Published 18 Aug 2023 in quant-ph, cs.DS, and cs.LG

Abstract: Clustering is one of the most important tools for analysis of large datasets, and perhaps the most popular clustering algorithm is Lloyd's iteration for $k$-means. This iteration takes $n$ vectors $V=[v_1,\dots,v_n]\in\mathbb{R}^{n\times d}$ and outputs $k$ centroids $c_1,\dots,c_k\in\mathbb{R}^d$; these partition the vectors into clusters based on which centroid is closest to a particular vector. We present an overall improved version of the "$q$-means" algorithm, the quantum algorithm originally proposed by Kerenidis, Landman, Luongo, and Prakash (NeurIPS'19) which performs $\varepsilon$-$k$-means, an approximate version of $k$-means clustering. Our algorithm does not rely on quantum linear algebra primitives of prior work, but instead only uses QRAM to prepare simple states based on the current iteration's clusters and multivariate quantum amplitude estimation. The time complexity is $\widetilde{O}\big(\frac{\|V\|_F}{\sqrt{n}}\frac{k^{5/2}d}{\varepsilon}(\sqrt{k} + \log{n})\big)$ and maintains the logarithmic dependence on $n$ while improving the dependence on most of the other parameters. We also present a "dequantized" algorithm for $\varepsilon$-$k$-means which runs in $O\big(\frac{\|V\|_F^2}{n}\frac{k^{2}}{\varepsilon^2}(kd + \log{n})\big)$ time. Notably, this classical algorithm matches the logarithmic dependence on $n$ attained by the quantum algorithm.
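For orientation, the exact Lloyd's iteration that the abstract takes as its starting point can be sketched in a few lines of plain Python. This is a minimal illustration of the textbook classical step only, not the paper's $q$-means or its dequantized $\varepsilon$-$k$-means algorithm; the function name `lloyd_step`, the list-of-lists data layout, and the choice to leave an empty cluster's centroid unchanged are assumptions made for this example.

```python
def lloyd_step(vectors, centroids):
    """One exact Lloyd's iteration (illustrative sketch, hypothetical helper).

    vectors:   list of n points, each a list of d floats
    centroids: list of k current centroids, each a list of d floats
    Returns (new_centroids, labels), where labels[i] is the index of the
    centroid closest to vectors[i].
    """
    # Assignment step: label each vector with its nearest centroid
    # (squared Euclidean distance; ties go to the lowest index).
    labels = []
    for v in vectors:
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centroids]
        labels.append(dists.index(min(dists)))

    # Update step: move each centroid to the mean of its assigned vectors.
    new_centroids = []
    for j, c in enumerate(centroids):
        members = [v for v, lab in zip(vectors, labels) if lab == j]
        if members:
            new_centroids.append(
                [sum(m[i] for m in members) / len(members) for i in range(len(c))]
            )
        else:
            # Assumption for this sketch: an empty cluster keeps its centroid.
            new_centroids.append(list(c))
    return new_centroids, labels


points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
start = [[0.0, 0.0], [10.0, 10.0]]
new_centroids, labels = lloyd_step(points, start)
```

Each such step reads every one of the $n$ vectors, so its cost grows at least linearly in $n$. The running times quoted in the abstract depend on $n$ only through $\log n$ and the factor $\|V\|_F/\sqrt{n}$ (squared in the classical bound).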
