Emergent Mind

New Streaming Algorithms for High Dimensional EMD and MST

(2111.03528)
Published Nov 5, 2021 in cs.DS

Abstract

We study streaming algorithms for two fundamental geometric problems: computing the cost of a Minimum Spanning Tree (MST) of an $n$-point set $X \subset {1,2,\dots,\Delta}d$, and computing the Earth Mover Distance (EMD) between two multi-sets $A,B \subset {1,2,\dots,\Delta}d$ of size $n$. We consider the turnstile model, where points can be added and removed. We give a one-pass streaming algorithm for MST and a two-pass streaming algorithm for EMD, both achieving an approximation factor of $\tilde{O}(\log n)$ and using polylog$(n,d,\Delta)$-space only. Furthermore, our algorithm for EMD can be compressed to a single pass with a small additive error. Previously, the best known sublinear-space streaming algorithms for either problem achieved an approximation of $O(\min{ \log n , \log (\Delta d)} \log n)$ [Andoni-Indyk-Krauthgamer '08, Backurs-Dong-Indyk-Razenshteyn-Wagner '20]. For MST, we also prove that any constant space streaming algorithm can only achieve an approximation of $\Omega(\log n)$, analogous to the $\Omega(\log n)$ lower bound for EMD of [Andoni-Indyk-Krauthgamer '08]. Our algorithms are based on an improved analysis of a recursive space partitioning method known generically as the Quadtree. Specifically, we show that the Quadtree achieves an $\tilde{O}(\log n)$ approximation for both EMD and MST, improving on the $O(\min{ \log n , \log (\Delta d)} \log n)$ approximation of [Andoni-Indyk-Krauthgamer '08, Backurs-Dong-Indyk-Razenshteyn-Wagner '20].

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.