Fast Adaptive Federated Bilevel Optimization

Feihu Huang

arXiv:2211.01122 · cs.LG · November 15, 2022 · 1 citation

TL;DR

This paper introduces AdaFBiO, an adaptive federated bilevel optimization algorithm that efficiently handles distributed hierarchical machine learning tasks with nonconvex upper-level and strongly convex lower-level problems.

Contribution

The paper proposes a novel adaptive federated bilevel optimization algorithm with momentum-based variance reduction and local-SGD, achieving optimal sample and communication complexities.
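The two ingredients named above can be illustrated on a toy problem. The sketch below is not AdaFBiO: it is a single-level example with no adaptive matrices and no bilevel hypergradient, and every name in it (`local_storm_sgd`, `centers`, `a`, `local_steps`) is hypothetical. It assumes M clients that each hold a quadratic f_m(x) = 0.5 * ||x - c_m||^2, a STORM-style momentum-based variance-reduced gradient estimator, and a server that averages both iterates and estimators after a fixed number of local steps.

```python
import numpy as np


def local_storm_sgd(centers, rounds=50, local_steps=5, lr=0.1, a=0.5,
                    noise=0.1, seed=0):
    """Toy local-SGD loop with a STORM-style variance-reduced estimator.

    Client m minimizes f_m(x) = 0.5 * ||x - centers[m]||^2 from noisy
    gradients; the global optimum is the mean of the centers.
    """
    rng = np.random.default_rng(seed)
    num_clients, dim = centers.shape
    x = np.zeros((num_clients, dim))  # one local iterate per client
    # Initial estimator: a plain stochastic gradient at the starting point.
    est = (x - centers) + noise * rng.standard_normal((num_clients, dim))

    for _ in range(rounds):
        for _ in range(local_steps):
            x_prev = x.copy()
            x = x - lr * est  # local descent step, no communication
            # The same random sample is used at the new and old iterate.
            xi = noise * rng.standard_normal((num_clients, dim))
            g_new = (x - centers) + xi
            g_old = (x_prev - centers) + xi
            # Momentum-based variance-reduced update of the estimator.
            est = g_new + (1.0 - a) * (est - g_old)
        # Communication round: average iterates and estimators, broadcast.
        x = np.tile(x.mean(axis=0), (num_clients, 1))
        est = np.tile(est.mean(axis=0), (num_clients, 1))

    return x[0]


if __name__ == "__main__":
    centers = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 1.0]])
    print("estimate:", local_storm_sgd(centers))
    print("optimum: ", centers.mean(axis=0))
```

With `local_steps=5`, clients communicate once every five gradient steps, which is the mechanism local-SGD uses to reduce communication. The correction term `est - g_old` reuses the previous estimate to damp gradient noise.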

Findings

01

Achieves sample complexity of $\tilde{O}(\epsilon^{-3})$ for finding $\epsilon$-stationary points.

02

Achieves communication complexity of $\tilde{O}(\epsilon^{-2})$ for finding $\epsilon$-stationary points.

03

Demonstrates efficiency on federated hyper-representation learning and data hyper-cleaning tasks.

Abstract

Bilevel optimization is a popular hierarchical model in machine learning and has been widely applied to tasks such as meta learning, hyperparameter learning and policy optimization. Although many bilevel optimization algorithms have recently been developed, few adaptive algorithms focus on bilevel optimization in the distributed setting. It is well known that adaptive gradient methods show superior performance in both distributed and non-distributed optimization. In this paper, we therefore propose a novel adaptive federated bilevel optimization algorithm (i.e., AdaFBiO) to solve distributed bilevel optimization problems, where the objective function of the Upper-Level (UL) problem is possibly nonconvex and that of the Lower-Level (LL) problem is strongly convex. Specifically, our AdaFBiO algorithm builds on the momentum-based variance reduced technique and…
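For orientation, a distributed bilevel problem of the kind described above is commonly written as follows. This is a sketch in assumed notation, not quoted from the paper: M is the number of clients, f^m and g^m are the UL and LL objectives held by client m, and d and p are the dimensions of the UL and LL variables.

```latex
\min_{x \in \mathbb{R}^{d}} \; F(x) := \frac{1}{M} \sum_{m=1}^{M} f^{m}\bigl(x, y^{*}(x)\bigr)
\quad \text{s.t.} \quad
y^{*}(x) = \arg\min_{y \in \mathbb{R}^{p}} \; \frac{1}{M} \sum_{m=1}^{M} g^{m}(x, y)
```

Strong convexity of the averaged LL objective in y makes y^{*}(x) unique for every x, so F(x) is well defined even when each f^{m} is nonconvex in x.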

Taxonomy

Topics: Stochastic Gradient Optimization Techniques