Fully First-Order Methods for Decentralized Bilevel Optimization
Xiaoyu Wang, Xuxing Chen, Shiqian Ma, and Tong Zhang

TL;DR
This paper introduces a decentralized, fully first-order method for stochastic bilevel optimization whose sample complexity matches the best-known single-agent result with linear speedup in the number of agents. It targets multi-agent systems in which each agent communicates only with its neighbors.
Contribution
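The paper proposes DSGDA-GT (Decentralized Stochastic Gradient Descent and Ascent with Gradient Tracking), a decentralized algorithm for bilevel optimization that uses only first-order oracles. Its finite-time convergence analysis matches the best-known single-agent result with linear speedup.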
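For orientation, the bilevel problem in this setting is commonly written as below. This is a sketch of the standard formulation from the decentralized bilevel literature, not a quotation from the paper: the symbols $f_i$, $g_i$, $x$, $y$, and $\Phi$ are assumed notation, with $f_i$ and $g_i$ the upper- and lower-level objectives held by agent $i$.

$$
\min_{x} \; \Phi(x) = \frac{1}{n}\sum_{i=1}^{n} f_i\bigl(x, y^*(x)\bigr)
\quad \text{s.t.} \quad
y^*(x) = \arg\min_{y} \; \frac{1}{n}\sum_{i=1}^{n} g_i(x, y)
$$

Writing $f = \frac{1}{n}\sum_i f_i$ and $g = \frac{1}{n}\sum_i g_i$, and assuming $g$ is strongly convex in $y$, the implicit function theorem gives

$$
\nabla \Phi(x) = \nabla_x f\bigl(x, y^*(x)\bigr) - \nabla^2_{xy} g\bigl(x, y^*(x)\bigr)\,\bigl[\nabla^2_{yy} g\bigl(x, y^*(x)\bigr)\bigr]^{-1} \nabla_y f\bigl(x, y^*(x)\bigr).
$$

The second term involves Hessian and Jacobian information of $g$, which is what second-order oracles supply. A first-order method avoids evaluating these terms and relies on gradients of $f_i$ and $g_i$ only.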
Findings
Achieves $\tilde{O}(n^{-1}\epsilon^{-7})$ sample complexity for $n$ agents.
Demonstrates communication and training efficiency through numerical experiments.
Matches the best-known single-agent convergence results with linear speedup.
Abstract
This paper focuses on decentralized stochastic bilevel optimization (DSBO), where agents communicate only with their neighbors. We propose Decentralized Stochastic Gradient Descent and Ascent with Gradient Tracking (DSGDA-GT), a novel algorithm that requires only first-order oracles, which are much cheaper than the second-order oracles widely adopted in existing works. We further provide a finite-time convergence analysis showing that for $n$ agents collaboratively solving the DSBO problem, the sample complexity of finding an $\epsilon$-stationary point with our algorithm is $\tilde{O}(n^{-1}\epsilon^{-7})$, which matches the currently best-known results of the single-agent counterpart with linear speedup. The numerical experiments demonstrate both the communication efficiency and the training efficiency of our algorithm.
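To illustrate the gradient-tracking ingredient named in the abstract, here is a minimal NumPy sketch of generic decentralized gradient tracking. It is not DSGDA-GT: it assumes a single-level problem, exact (non-stochastic) gradients, quadratic local objectives $f_i(x) = \tfrac{1}{2}\|x - b_i\|^2$, and a ring network. The function names `ring_mixing_matrix` and `gradient_tracking`, the step size, and the iteration count are all hypothetical choices made for this example.

```python
import numpy as np


def ring_mixing_matrix(n):
    """Symmetric doubly stochastic matrix for a ring of n >= 3 agents.

    Each agent averages its own value with those of its two neighbors.
    """
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i, j % n] = 1.0 / 3.0
    return W


def gradient_tracking(targets, W, step=0.1, iters=500):
    """Generic gradient tracking on f_i(x) = 0.5 * ||x - targets[i]||^2.

    Row i of X is agent i's iterate; row i of Y is agent i's running
    estimate of the network-average gradient.
    """
    n, d = targets.shape

    def grad(X):
        # Row i is the gradient of agent i's local objective at X[i].
        return X - targets

    X = np.zeros((n, d))
    Y = grad(X)  # trackers start at the local gradients
    for _ in range(iters):
        # Mix with neighbors, then step along the tracked gradient.
        X_next = W @ X - step * Y
        # Mix trackers and correct by the change in the local gradient.
        Y = W @ Y + grad(X_next) - grad(X)
        X = X_next
    return X


rng = np.random.default_rng(0)
targets = rng.normal(size=(5, 2))
X = gradient_tracking(targets, ring_mixing_matrix(5))
# Largest deviation of any agent from the minimizer of the average objective.
print(np.abs(X - targets.mean(axis=0)).max())
```

Each agent only ever combines quantities from its ring neighbors (the nonzero entries of `W`), yet all rows of `X` approach the minimizer of the average objective, which here is the mean of the `targets` rows. The tracker `Y` is what lets a constant step size reach the exact solution despite the agents holding different local objectives.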
Taxonomy
Topics: Optimization and Variational Analysis · Advanced Optimization Algorithms Research · Matrix Theory and Algorithms
