$\ell_p$-Regression in the Arbitrary Partition Model of Communication

(2307.05117)
Published Jul 11, 2023 in cs.DS, cs.DC, and cs.LG

Abstract

We consider the randomized communication complexity of the distributed $\ell_p$-regression problem in the coordinator model, for $p\in (0,2]$. In this problem, there is a coordinator and $s$ servers. The $i$-th server receives $A^i\in\{-M, -M+1, \ldots, M\}^{n\times d}$ and $b^i\in\{-M, -M+1, \ldots, M\}^n$, and the coordinator would like to find a $(1+\epsilon)$-approximate solution to $\min_{x\in\mathbb{R}^d} \|(\sum_i A^i)x - (\sum_i b^i)\|_p$. Here $M \leq \mathrm{poly}(nd)$ for convenience. This model, where the data is additively shared across servers, is commonly referred to as the arbitrary partition model. We obtain significantly improved bounds for this problem. For $p = 2$, i.e., least squares regression, we give the first optimal bound of $\tilde{\Theta}(sd^2 + sd/\epsilon)$ bits. For $p \in (1,2)$, we obtain an $\tilde{O}(sd^2/\epsilon + sd/\mathrm{poly}(\epsilon))$ upper bound. Notably, for $d$ sufficiently large, our leading-order term depends only linearly on $1/\epsilon$ rather than quadratically. We also show communication lower bounds of $\Omega(sd^2 + sd/\epsilon^2)$ for $p\in (0,1]$ and $\Omega(sd^2 + sd/\epsilon)$ for $p\in (1,2]$. Our bounds considerably improve previous bounds due to (Woodruff et al., COLT 2013) and (Vempala et al., SODA 2020).
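The problem setup can be made concrete with a small sketch. This is purely illustrative of the arbitrary partition model for $p = 2$ and is not the paper's communication-efficient protocol: it simply sums the servers' additive shares and solves the resulting least-squares instance exactly. All names (`A_shares`, `b_shares`, the chosen dimensions) are assumptions for the example.

```python
import numpy as np

# Illustrative sketch of the arbitrary partition model (NOT the paper's
# protocol): each of s servers holds an additive integer share (A^i, b^i)
# in {-M, ..., M}; the regression instance is defined by their sums.
rng = np.random.default_rng(0)
s, n, d, M = 4, 50, 3, 100  # hypothetical sizes for the example

A_shares = [rng.integers(-M, M + 1, size=(n, d)) for _ in range(s)]
b_shares = [rng.integers(-M, M + 1, size=n) for _ in range(s)]

# The coordinator's target instance is the entrywise sum of the shares.
# (Naively communicating the shares costs ~ s*n*d*log(M) bits; the paper's
# contribution is protocols using far fewer bits, e.g. ~ sd^2 + sd/eps
# for p = 2, with matching lower bounds.)
A = sum(A_shares)
b = sum(b_shares)

# For p = 2, the target value is min_x ||A x - b||_2; solve it exactly
# here as a reference point for what a (1+eps)-approximation must match.
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
residual = np.linalg.norm(A @ x_star - b)
```

A $(1+\epsilon)$-approximate solution $\hat{x}$ is then any vector with $\|A\hat{x} - b\|_2 \leq (1+\epsilon)\,\mathrm{residual}$; the bounds above measure how many bits the servers and coordinator must exchange to find one.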
