
Efficient Accelerated Coordinate Descent Methods and Faster Algorithms for Solving Linear Systems (1305.1922v1)

Published 8 May 2013 in cs.DS and math.NA

Abstract: In this paper we show how to accelerate randomized coordinate descent methods and achieve faster convergence rates without paying per-iteration costs in asymptotic running time. In particular, we show how to generalize and efficiently implement a method proposed by Nesterov, giving faster asymptotic running times for various algorithms that use standard coordinate descent as a black box. In addition to providing a proof of convergence for this new general method, we show that it is numerically stable, efficiently implementable, and in certain regimes, asymptotically optimal. To highlight the computational power of this algorithm, we show how it can be used to create faster linear system solvers in several regimes:

  • We show how this method achieves a faster asymptotic runtime than conjugate gradient for solving a broad class of symmetric positive definite systems of equations.
  • We improve the best known asymptotic convergence guarantees for Kaczmarz methods, a popular technique for image reconstruction and solving overdetermined systems of equations, by accelerating a randomized algorithm of Strohmer and Vershynin.
  • We achieve the best known running time for solving Symmetric Diagonally Dominant (SDD) systems of equations in the unit-cost RAM model, obtaining an O(m log^{3/2} n (log log n)^{1/2} log(log n / eps)) asymptotic running time by accelerating a recent solver by Kelner et al.

Beyond the independent interest of these solvers, we believe they highlight the versatility of the approach of this paper, and we hope that they will open the door for further algorithmic improvements in the future.

Citations (225)

Summary

  • The paper extends Nesterov’s accelerated coordinate descent methods to achieve faster convergence in linear system solvers without extra iteration costs.
  • It employs probabilistic estimate sequences to offer theoretical guarantees and numerical stability across SPD, overdetermined, and SDD systems.
  • The approach provides practical speed-ups for large-scale applications such as image reconstruction, data analysis, and distributed optimization.

Essay on "Efficient Accelerated Coordinate Descent Methods and Faster Algorithms for Solving Linear Systems"

The paper by Yin Tat Lee and Aaron Sidford, "Efficient Accelerated Coordinate Descent Methods and Faster Algorithms for Solving Linear Systems," makes significant contributions to optimization by addressing inefficiencies in existing coordinate descent methods. The authors propose an accelerated variant of randomized coordinate descent that markedly improves convergence rates without incurring additional asymptotic cost per iteration. Their method generalizes a framework originally proposed by Nesterov and implements it efficiently, improving both its practical and theoretical performance.

Overview and Methodology

The core of the authors' contribution is an efficient iteration scheme for the Accelerated Coordinate Descent Method (ACDM). This scheme achieves faster convergence rates, particularly for solving linear systems of equations, while matching the asymptotic per-iteration cost of non-accelerated variants, and it is shown to be numerically stable and, in certain regimes, asymptotically optimal.
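To make the coordinate-update primitive concrete, here is a minimal sketch of the non-accelerated randomized coordinate descent baseline that ACDM accelerates, applied to an SPD system Ax = b (equivalently, minimizing f(x) = (1/2) x^T A x - b^T x). The scheme, function names, and parameters below are illustrative assumptions, not the authors' pseudocode; ACDM layers a Nesterov-style extrapolation sequence on top of the same cheap coordinate update.

```python
import numpy as np

def randomized_coordinate_descent(A, b, iters=10000, seed=0):
    """Illustrative baseline: randomized coordinate descent for SPD A x = b.

    Each step exactly minimizes f(x) = 0.5 x^T A x - b^T x along one
    coordinate, sampled with probability proportional to L_i = A_ii
    (the coordinate-wise Lipschitz constant of the gradient).
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    L = np.diag(A).copy()          # coordinate Lipschitz constants
    p = L / L.sum()                # sampling distribution
    x = np.zeros(n)
    g = -b.astype(float)           # gradient A x - b, maintained incrementally
    for _ in range(iters):
        i = rng.choice(n, p=p)
        delta = -g[i] / L[i]       # exact minimizer along coordinate i
        x[i] += delta
        g += delta * A[:, i]       # dense here; O(nnz of column i) if sparse
    return x

# quick sanity check on a random well-conditioned SPD system
rng = np.random.default_rng(1)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)
b = np.ones(50)
x = randomized_coordinate_descent(A, b)
print(np.linalg.norm(A @ x - b))   # residual should be small
```

The point of the paper is that acceleration can be layered onto this loop without changing its per-iteration cost: the extra sequences required by Nesterov-style momentum can be maintained implicitly rather than through full O(n)-vector work at each step.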

The authors further illustrate the approach's versatility by applying it to several classes of linear systems, demonstrating improved asymptotic runtimes over existing methods in each case.

Key Contributions and Numerical Results

Some of the paper's key contributions are as follows:

  1. Generalization of ACDM: The authors generalize Nesterov's method across a broader range of sampling probabilities, overcoming challenges posed by skewed probabilities. This generalization enables ACDM to improve upon any convergence rate achieved by earlier coordinate descent methods.
  2. Mathematical Analysis: The convergence analysis utilizes probabilistic estimate sequences, establishing a theoretical structure that aligns well with practical efficiency.
  3. Efficiency in Iteration: The new iteration scheme avoids the full-vector updates that would otherwise add per-iteration overhead when maintaining the accelerated sequences, so acceleration is obtained at no additional asymptotic per-iteration cost, a critical issue in prior analyses of ACDM.
  4. Applicability to Linear Systems: The paper successfully applies ACDM to:
    • Symmetric Positive Definite (SPD) systems, outperforming conjugate gradient methods in certain regimes.
    • Overdetermined systems via accelerated Kaczmarz strategies, improving convergence guarantees beyond traditional methods (a sketch of the base randomized Kaczmarz update appears after this list).
    • Symmetric Diagonally Dominant (SDD) systems, achieving significant speed-ups in the unit-cost RAM model.
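For the second application, the object being accelerated is the randomized Kaczmarz method of Strohmer and Vershynin. Below is a minimal sketch of that base method, assuming a consistent overdetermined system; it shows only the norm-weighted row sampling and projection step that the paper's accelerated variant builds on, and all names and parameters are illustrative.

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=20000, seed=0):
    """Illustrative sketch of Strohmer-Vershynin randomized Kaczmarz
    for a consistent overdetermined system A x = b.

    Row i is sampled with probability ||a_i||^2 / ||A||_F^2, and the
    iterate is orthogonally projected onto the hyperplane <a_i, x> = b_i.
    """
    rng = np.random.default_rng(seed)
    row_norms_sq = np.einsum('ij,ij->i', A, A)   # squared row norms
    p = row_norms_sq / row_norms_sq.sum()
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        i = rng.choice(A.shape[0], p=p)
        x += (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]
    return x

# consistent 200 x 20 system: the iterates converge to x_true
rng = np.random.default_rng(2)
A = rng.standard_normal((200, 20))
x_true = rng.standard_normal(20)
x = randomized_kaczmarz(A, A @ x_true)
print(np.linalg.norm(x - x_true))   # error should be small
```

Strohmer and Vershynin showed that this method converges linearly in expectation at a rate governed by a scaled condition number of A; the paper's acceleration improves that dependence without increasing the cost of each projection step.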

Implications and Future Directions

The implications of these results are manifold. Practically, these improvements can accelerate a range of applications from image reconstruction to large-scale data analysis, where linear systems are prevalent. Theoretically, the framework set by Lee and Sidford paves the way for further innovations in both distributed and synchronous optimization algorithms. By unifying and extending prior work on randomized Kaczmarz and gradient descent methods, the authors set a precedent for future advancements in iterative algorithms that can efficiently handle ever-growing data scales and distributed computing environments.

Moreover, the enhanced understanding of efficient coordinate updates in accelerated frameworks could broaden avenues for optimization in more complex problems and computational settings. This includes potential extensions to nonlinear optimization problems where coordinate descent methods have proven useful.

In conclusion, Lee and Sidford's research represents a substantial advancement in coordinate descent methodology, with impact on both the theoretical and applied aspects of solving linear systems and optimization problems. Their careful treatment of efficient ACDM implementation provides a robust foundation for future work and for adapting these optimization techniques to increasingly complex computational settings.