
A Penalty-Based Method for Communication-Efficient Decentralized Bilevel Programming (2211.04088v4)

Published 8 Nov 2022 in cs.LG, cs.DC, and math.OC

Abstract: Bilevel programming has recently received attention in the literature due to its wide range of applications, including reinforcement learning and hyper-parameter optimization. However, it is widely assumed that the underlying bilevel optimization problem is solved either on a single machine or across multiple machines connected in a star-shaped network, i.e., in a federated learning setting. The latter approach suffers from a high communication cost on the central node (e.g., the parameter server). Hence, there is an interest in developing methods that solve bilevel optimization problems in a communication-efficient, decentralized manner. To that end, this paper introduces a penalty function-based decentralized algorithm with theoretical guarantees for this class of optimization problems. Specifically, a distributed alternating gradient-type algorithm for solving consensus bilevel programming over a decentralized network is developed. A key feature of the proposed algorithm is the estimation of the hyper-gradient of the penalty function through decentralized computation of matrix-vector products and a few vector communications. This estimate is integrated into an alternating algorithm for solving the penalized reformulation of the bilevel optimization problem. Under appropriate step sizes and penalty parameters, our theoretical framework ensures non-asymptotic convergence to the optimal solution of the original problem under various convexity conditions. Our theoretical results highlight improvements in the iteration complexity of decentralized bilevel optimization, all while making efficient use of vector communication. Empirical results demonstrate that the proposed method performs well in real-world settings.
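
To make the main ingredients of the abstract concrete, below is a minimal, illustrative sketch of a decentralized alternating gradient scheme on a penalized surrogate of a toy consensus bilevel problem. The quadratic problem data, the specific gradient-norm penalty, the ring topology, and the step sizes are all assumptions made here for illustration; this is not the paper's exact algorithm or analysis, only a sketch of how a penalized hyper-gradient can be formed from local Hessian-vector products plus a consensus (gossip) communication step.

```python
import numpy as np

# Illustrative sketch only (problem data, penalty form, and step sizes are assumptions):
# decentralized alternating gradient descent on a penalized surrogate of a toy
# consensus bilevel problem over a ring network.
# Upper level (node i):  f_i(x, y) = 0.5*||y - c_i||^2 + 0.5*||x||^2
# Lower level (node i):  g_i(x, y) = 0.5*||A_i @ y - B_i @ x||^2
# Penalized local objective: phi_i(x, y) = f_i(x, y) + lam * 0.5*||grad_y g_i(x, y)||^2,
# whose gradients need only matrix-vector products with the blocks of the Hessian of g_i.

rng = np.random.default_rng(0)
n_nodes, dx, dy = 8, 5, 5
A = [rng.standard_normal((dy, dy)) + 2 * np.eye(dy) for _ in range(n_nodes)]
B = [rng.standard_normal((dy, dx)) for _ in range(n_nodes)]
c = [rng.standard_normal(dy) for _ in range(n_nodes)]

# Doubly stochastic mixing matrix for a ring topology (each node talks to two neighbors).
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    W[i, i] = 0.5
    W[i, (i - 1) % n_nodes] = 0.25
    W[i, (i + 1) % n_nodes] = 0.25

def grad_y_g(i, x, y):
    # Gradient of the local lower-level objective g_i with respect to y.
    return A[i].T @ (A[i] @ y - B[i] @ x)

def grads_phi(i, x, y, lam):
    # Gradients of the penalized local objective phi_i; the penalty part only
    # requires Hessian-vector products H_xy^T v and H_yy v with v = grad_y g_i.
    v = grad_y_g(i, x, y)
    gx = x + lam * (-B[i].T @ (A[i] @ v))          # grad_x f_i + lam * H_xy^T v
    gy = (y - c[i]) + lam * (A[i].T @ (A[i] @ v))  # grad_y f_i + lam * H_yy v
    return gx, gy

lam, alpha, beta, iters = 10.0, 5e-4, 5e-4, 3000
X = np.zeros((n_nodes, dx))  # each node keeps its own copy of the upper-level variable
Y = np.zeros((n_nodes, dy))  # each node keeps its own copy of the lower-level variable

for _ in range(iters):
    X, Y = W @ X, W @ Y      # consensus/gossip step: average with neighbors (vector communication)
    GX, GY = np.zeros_like(X), np.zeros_like(Y)
    for i in range(n_nodes):
        GX[i], GY[i] = grads_phi(i, X[i], Y[i], lam)
    Y -= beta * GY           # lower-level update
    X -= alpha * GX          # upper-level update (alternating with the y step)

print("consensus error in x:", np.linalg.norm(X - X.mean(axis=0)))
print("consensus error in y:", np.linalg.norm(Y - Y.mean(axis=0)))
```

The sketch highlights the communication pattern emphasized in the abstract: per iteration, each node exchanges only a few vectors with its neighbors (the gossip step), while the hyper-gradient of the penalized objective is assembled from purely local matrix-vector products, with no central parameter server and no explicit matrix inversion.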

Citations (4)