Distributed Non-Convex First-Order Optimization and Information Processing: Lower Complexity Bounds and Rate Optimal Algorithms (1804.02729v4)

Published 8 Apr 2018 in math.OC, cs.DC, cs.IT, and math.IT

Abstract: We consider a class of popular distributed non-convex optimization problems, in which agents connected by a network $\mathcal{G}$ collectively optimize a sum of smooth (possibly non-convex) local objective functions. We address the following question: if the agents can only access the gradients of their local functions, what are the fastest rates that any distributed algorithm can achieve, and how can those rates be achieved? First, we show that there exist difficult problem instances for which a class of distributed first-order methods needs at least $\Omega(1/\sqrt{\xi(\mathcal{G})} \times \bar{L}/\epsilon)$ communication rounds to reach a certain $\epsilon$-solution (where $\xi(\mathcal{G})$ denotes the spectral gap of the graph Laplacian matrix, and $\bar{L}$ is some Lipschitz constant). Second, we propose (near) optimal methods whose rates match the developed lower rate bound (up to a polylog factor). The key to the algorithm design is to properly embed the classical polynomial filtering techniques into modern first-order algorithms. To the best of our knowledge, this is the first time that lower rate bounds and optimal methods have been developed for distributed non-convex optimization problems.
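To make the role of polynomial filtering and of the $1/\sqrt{\xi(\mathcal{G})}$ factor more concrete, here is a minimal sketch of Chebyshev-filtered averaging on a path graph, compared with plain Laplacian averaging at the same number of communication rounds. It is an illustration only and not the paper's algorithm: the path-graph topology, the function names, the ramp-shaped initial values, and the convention $\xi = \lambda_{\min}^{+}/\lambda_{\max}$ (smallest nonzero over largest Laplacian eigenvalue) are all assumptions made for this example.

```python
import math

def laplacian_apply(x):
    """Multiply x by the Laplacian of a path graph.
    Each call models one round of neighbor-to-neighbor communication."""
    n = len(x)
    out = []
    for i in range(n):
        acc = 0.0
        if i > 0:
            acc += x[i] - x[i - 1]
        if i < n - 1:
            acc += x[i] - x[i + 1]
        out.append(acc)
    return out

def path_spectrum(n):
    """Smallest nonzero and largest Laplacian eigenvalue of an n-node path graph
    (closed form: 2 - 2*cos(pi*k/n), k = 0..n-1)."""
    lam_min = 2.0 - 2.0 * math.cos(math.pi / n)
    lam_max = 2.0 - 2.0 * math.cos(math.pi * (n - 1) / n)
    return lam_min, lam_max

def plain_averaging(x, rounds, lam_min, lam_max):
    """Baseline: repeat x <- x - L x / c with c the midpoint of the spectrum."""
    c = 0.5 * (lam_max + lam_min)
    for _ in range(rounds):
        lx = laplacian_apply(x)
        x = [xi - li / c for xi, li in zip(x, lx)]
    return x

def chebyshev_filter(x, rounds, lam_min, lam_max):
    """Apply p(L) x, where p is the degree-`rounds` Chebyshev polynomial rescaled so
    that p(0) = 1 and |p| is small on [lam_min, lam_max]. Uses `rounds` Laplacian
    multiplications via the three-term recurrence."""
    if rounds == 0:
        return list(x)
    c = 0.5 * (lam_max + lam_min)
    d = 0.5 * (lam_max - lam_min)
    mu = c / d  # image of eigenvalue 0 (the consensus direction), mu > 1

    def z_apply(v):
        # Z = (c I - L) / d maps [lam_min, lam_max] into [-1, 1].
        lv = laplacian_apply(v)
        return [(c * vi - li) / d for vi, li in zip(v, lv)]

    prev, cur = list(x), z_apply(x)   # T_0(Z) x and T_1(Z) x
    t_prev, t_cur = 1.0, mu           # T_0(mu) and T_1(mu)
    for _ in range(rounds - 1):
        zc = z_apply(cur)
        prev, cur = cur, [2.0 * a - b for a, b in zip(zc, prev)]
        t_prev, t_cur = t_cur, 2.0 * mu * t_cur - t_prev
    return [v / t_cur for v in cur]

def deviation(x):
    """Euclidean distance of x from its own average (consensus error)."""
    m = sum(x) / len(x)
    return math.sqrt(sum((xi - m) ** 2 for xi in x))

# Hypothetical setup: 20 agents on a path, agent i holds the value i.
n, K = 20, 50
x0 = [float(i) for i in range(n)]
lam_min, lam_max = path_spectrum(n)
xi_gap = lam_min / lam_max            # assumed definition of the spectral gap

dev0 = deviation(x0)
plain_err = deviation(plain_averaging(x0, K, lam_min, lam_max))
x_cheb = chebyshev_filter(x0, K, lam_min, lam_max)
cheb_err = deviation(x_cheb)

print("spectral gap:", xi_gap, " 1/sqrt(gap):", 1.0 / math.sqrt(xi_gap))
print("relative error, plain averaging:", plain_err / dev0)
print("relative error, Chebyshev filter:", cheb_err / dev0)
```

Both routines use the same number of Laplacian multiplications, so the comparison is at equal communication cost. The plain iteration contracts the consensus error by roughly $1 - O(\xi)$ per round, while the Chebyshev filter contracts it by roughly $1 - O(\sqrt{\xi})$ per round, which is the kind of $1/\sqrt{\xi(\mathcal{G})}$ dependence that appears in the lower bound above.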
