FedHyper: A Universal and Robust Learning Rate Scheduler for Federated Learning with Hypergradient Descent

Ziyao Wang; Jianyu Wang; Ang Li

arXiv:2310.03156 · cs.LG · October 9, 2023 · 1 citation

Open Access

TL;DR

FedHyper introduces a hypergradient-based learning rate scheduler for federated learning, enabling faster convergence and higher accuracy with reduced need for manual hyperparameter tuning.

Contribution

The paper proposes FedHyper, a universal and robust learning rate adaptation algorithm for federated learning that improves convergence speed and accuracy while reducing the need for manual learning rate tuning.

Findings

01. Converges 1.1-3x faster than FedAvg.

02. Achieves up to 15% higher accuracy under suboptimal initial rates.

03. Demonstrates robustness across vision and language benchmarks.

Abstract

The theoretical landscape of federated learning (FL) undergoes rapid evolution, but its practical application encounters a series of intricate challenges, and hyperparameter optimization is one of these critical challenges. Amongst the diverse adjustments in hyperparameters, the adaptation of the learning rate emerges as a crucial component, holding the promise of significantly enhancing the efficacy of FL systems. In response to this critical need, this paper presents FedHyper, a novel hypergradient-based learning rate adaptation algorithm specifically designed for FL. FedHyper serves as a universal learning rate scheduler that can adapt both global and local rates as the training progresses. In addition, FedHyper not only showcases unparalleled robustness to a spectrum of initial learning rate configurations but also significantly alleviates the necessity for laborious empirical…
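
To make the idea of hypergradient-based learning rate adaptation concrete, the following is a minimal sketch of generic hypergradient descent on a single model, not FedHyper's actual global or local scheduler, whose details are not given in this summary. It assumes a toy quadratic loss, and the names `hypergradient_sgd`, `hyper_lr`, and `grad` are hypothetical choices for illustration. The learning rate is increased when consecutive gradients point in a similar direction (positive inner product) and decreased when they oppose each other.

```python
def grad(theta):
    # Gradient of the assumed toy loss f(theta) = 0.5 * sum(theta_i ** 2).
    return list(theta)


def dot(a, b):
    # Inner product of two equal-length vectors.
    return sum(x * y for x, y in zip(a, b))


def hypergradient_sgd(theta, lr, hyper_lr, steps):
    """Plain SGD whose learning rate is itself updated by gradient descent.

    Illustrative only: a generic hypergradient rule, not the FedHyper update.
    """
    prev_grad = [0.0] * len(theta)  # no previous gradient at the first step
    lr_history = []
    for _ in range(steps):
        g = grad(theta)
        # The derivative of the loss with respect to the learning rate is
        # -(g . prev_grad), so a descent step on lr adds hyper_lr * (g . prev_grad).
        lr = lr + hyper_lr * dot(g, prev_grad)
        theta = [t - lr * gi for t, gi in zip(theta, g)]
        prev_grad = g
        lr_history.append(lr)
    return theta, lr_history


if __name__ == "__main__":
    final_theta, lrs = hypergradient_sgd([1.0, -2.0], lr=0.01, hyper_lr=0.01, steps=50)
    print("final learning rate:", lrs[-1])
    print("final squared norm:", sum(t * t for t in final_theta))
```

In this toy run the initial rate of 0.01 is deliberately too small; because successive gradients stay aligned, the rule raises the rate on its own, which illustrates the kind of robustness to poor initial learning rates that the abstract describes.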

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications