Distributed Bilevel Optimization with Communication Compression
Yutong He, Jie Hu, Xinmeng Huang, Songtao Lu, Bin Wang, Kun Yuan

TL;DR
This paper introduces the first family of distributed bilevel optimization algorithms with communication compression, significantly reducing communication costs while maintaining convergence efficiency.
Contribution
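It proposes C-SOBA, a distributed bilevel algorithm that uses unbiased compression and has provable linear speedup convergence, and then works to remove C-SOBA's reliance on bounded-gradient assumptions. The central difficulty it addresses is the bias in hypergradient estimation caused by the nested problem structure.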
Findings
Achieves 10x reduction in communication overhead.
Provides algorithms with linear speedup convergence.
Demonstrates effectiveness through numerical experiments.
Abstract
Stochastic bilevel optimization tackles challenges involving nested optimization structures. Its fast-growing scale nowadays necessitates efficient distributed algorithms. In conventional distributed bilevel methods, each worker must transmit full-dimensional stochastic gradients to the server every iteration, leading to significant communication overhead and thus hindering efficiency and scalability. To resolve this issue, we introduce the first family of distributed bilevel algorithms with communication compression. The primary challenge in algorithmic development is mitigating bias in hypergradient estimation caused by the nested structure. We first propose C-SOBA, a simple yet effective approach with unbiased compression and provable linear speedup convergence. However, it relies on strong assumptions on bounded gradients. To address this limitation, we explore the use of moving…
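To make "unbiased compression" concrete, here is a minimal Python sketch of one standard unbiased compressor, rand-k sparsification. It is not taken from the paper. The abstract does not say which compressor C-SOBA uses, and the names `rand_k` and `server_average` are hypothetical. Each worker keeps k randomly chosen coordinates of its d-dimensional vector and rescales them by d/k, so the compressed vector equals the original in expectation while only k values are transmitted.

```python
import random


def rand_k(vec, k, rng):
    """Rand-k sparsifier (illustrative, not the paper's stated compressor).

    Keeps k coordinates chosen uniformly at random and scales them by d/k,
    so that the expectation of the output equals the input vector.
    """
    d = len(vec)
    kept = set(rng.sample(range(d), k))
    scale = d / k
    return [scale * v if i in kept else 0.0 for i, v in enumerate(vec)]


def server_average(worker_vecs, k, rng):
    """Hypothetical server step: average the compressed worker vectors."""
    compressed = [rand_k(v, k, rng) for v in worker_vecs]
    n = len(compressed)
    return [sum(column) / n for column in zip(*compressed)]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[1.0, -2.0, 3.0, 0.5], [0.5, 1.5, -1.0, 2.0]]
    # Each worker sends only 2 of its 4 coordinates.
    print(server_average(grads, k=2, rng=rng))
```

A single compressed vector can be far from the original. Unbiasedness only guarantees that the error averages out across workers and iterations, which is why compressor variance typically appears in convergence analyses of such methods.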
Taxonomy
Topics: Stochastic processes and financial applications · Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research
