A Single-Loop Algorithm for Decentralized Bilevel Optimization

Youran Dong; Shiqian Ma; Junfeng Yang; Chao Yin

arXiv:2311.08945 · math.OC · April 24, 2024 · 1 citation

TL;DR

This paper introduces a single-loop algorithm for decentralized bilevel optimization with a strongly convex lower-level problem. It approximates the hypergradient using only two matrix-vector multiplications per iteration, requires no gradient heterogeneity assumption, attains the best-known convergence rate for bilevel optimization algorithms, and is evaluated on hyperparameter optimization problems.

Contribution

It presents the first fully single-loop algorithm for decentralized bilevel optimization that does not rely on gradient heterogeneity assumptions, with an analysis showing the best-known convergence rate for bilevel optimization algorithms.

Findings

01. Achieves the best-known convergence rate for bilevel optimization algorithms.

02. Demonstrates efficiency through experiments on hyperparameter optimization tasks.

03. Does not require gradient heterogeneity assumptions, unlike existing methods.

Abstract

Bilevel optimization has gained significant attention in recent years due to its broad applications in machine learning. This paper focuses on bilevel optimization in decentralized networks and proposes a novel single-loop algorithm for solving decentralized bilevel optimization with a strongly convex lower-level problem. Our approach is a fully single-loop method that approximates the hypergradient using only two matrix-vector multiplications per iteration. Importantly, our algorithm does not require any gradient heterogeneity assumption, distinguishing it from existing methods for decentralized bilevel optimization and federated bilevel optimization. Our analysis demonstrates that the proposed algorithm achieves the best-known convergence rate for bilevel optimization algorithms. We also present experimental results on hyperparameter optimization problems using both synthetic and…
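
To make the phrase "two matrix-vector multiplications per iteration" concrete, here is a minimal single-agent sketch of a generic single-loop hypergradient scheme on a toy quadratic problem. It is not the paper's decentralized algorithm: the network communication step is omitted entirely, and the problem data (`A`, `B`, `b`), the step sizes (`alpha`, `beta`, `gamma`), and the function name `single_loop_bilevel` are all assumptions chosen for illustration. The sketch tracks an auxiliary vector `v` that approximates the inverse-Hessian-vector product in the hypergradient, so no linear system is solved and no inner loop is run.

```python
# Toy single-agent sketch (an assumption-laden illustration, NOT the paper's method).
# Lower level: g(x, y) = 0.5 * y^T A y - (B x)^T y   (strongly convex in y if A is SPD)
# Upper level: f(x, y) = 0.5 * ||y - b||^2
# Hypergradient: grad_x f - (d/dx grad_y g)^T [hess_yy g]^{-1} grad_y f = B^T A^{-1} (y*(x) - b)


def matvec(M, u):
    """Dense matrix-vector product for lists of lists."""
    return [sum(m * ui for m, ui in zip(row, u)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def axpy(a, u, w):
    """Return w + a * u elementwise."""
    return [wi + a * ui for ui, wi in zip(u, w)]


def single_loop_bilevel(A, B, b, alpha=0.01, beta=0.3, gamma=0.3, iters=20000):
    """Update x, y and v once per iteration, all from the previous iterates."""
    n = len(b)
    x, y, v = [0.0] * n, [0.0] * n, [0.0] * n
    Bt = transpose(B)
    for _ in range(iters):
        # Gradient of the lower-level objective in y: A y - B x.
        grad_y_g = [ay - bx for ay, bx in zip(matvec(A, y), matvec(B, x))]
        # Gradient of the upper-level objective in y: y - b.
        grad_y_f = [yi - bi for yi, bi in zip(y, b)]
        # Matrix-vector product 1: Hessian-vector product hess_yy g * v = A v.
        hvp = matvec(A, v)
        # Matrix-vector product 2: cross-derivative term. Here grad_x f = 0 and
        # d/dx grad_y g = -B, so the hypergradient estimate is B^T v.
        hypergrad = matvec(Bt, v)
        # One gradient step on the lower-level variable.
        y = axpy(-beta, grad_y_g, y)
        # One step toward the solution of A v = grad_y f.
        v = axpy(-gamma, [h - g for h, g in zip(hvp, grad_y_f)], v)
        # One step on the upper-level variable using the estimated hypergradient.
        x = axpy(-alpha, hypergrad, x)
    return x, y, v


# Hypothetical problem data for illustration.
A = [[2.0, 0.5], [0.5, 1.0]]
B = [[1.0, 0.5], [0.0, 1.0]]
b = [1.0, -1.0]
x, y, v = single_loop_bilevel(A, B, b)
```

For this toy instance the upper-level minimizer solves `B x = A b`, which gives `x = (1.75, -0.5)` and `y = b` analytically, so the iterates can be checked against a known answer. The upper-level step size is deliberately much smaller than the other two so that `y` and `v` track their targets while `x` moves slowly; that choice is a heuristic for this sketch and is not taken from the paper's analysis.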

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques