Federated Hypergradient Descent
Andrew K Kan

TL;DR
This paper introduces FATHOM, a one-shot, online hyperparameter optimization method for federated learning that reduces communication and computation costs by adaptively tuning hyperparameters without trial-and-error.
Contribution
FATHOM is a novel, analytical, gradient-based hyperparameter optimization method for federated learning that uses less communication and computation than traditional static hyperparameter tuning.
Findings
FATHOM outperforms FedAvg with static hyperparameters in communication efficiency.
FATHOM reduces overall computational costs during training.
Empirical results show improved model performance on FEMNIST and Stack Overflow datasets.
Abstract
In this work, we explore combining automatic hyperparameter tuning and optimization for federated learning (FL) in an online, one-shot procedure. We apply a principled approach on a method for adaptive client learning rate, number of local steps, and batch size. In our federated learning applications, our primary motivations are minimizing communication budget as well as local computational resources in the training pipeline. Conventionally, hyperparameter tuning methods involve at least some degree of trial-and-error, which is known to be sample inefficient. In order to address our motivations, we propose FATHOM (Federated AuTomatic Hyperparameter OptiMization) as a one-shot online procedure. We investigate the challenges and solutions of deriving analytical gradients with respect to the hyperparameters of interest. Our approach is inspired by the fact that, with the exception of local…
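To make the idea of an analytical gradient with respect to a hyperparameter concrete, here is a minimal sketch of generic hypergradient descent on a single learning rate, run on a toy quadratic. It is an illustration only and not the FATHOM algorithm: there are no clients, no local steps, and no batch size, and the names `hypergradient_sgd`, `quadratic_grad`, `lr0`, and `hyper_lr` are hypothetical. For plain SGD, the derivative of the current loss with respect to the previous step's learning rate is the negative dot product of the current and previous gradients, so the learning rate can be updated online alongside the parameters.

```python
import numpy as np


def hypergradient_sgd(grad_fn, theta0, lr0=0.01, hyper_lr=1e-3, steps=100):
    """Toy SGD whose learning rate is itself tuned by gradient descent.

    Illustrative assumption: a single-machine, full-gradient setting,
    not a federated one.
    """
    theta = np.asarray(theta0, dtype=float)
    lr = lr0
    prev_grad = np.zeros_like(theta)  # no previous gradient on the first step
    lr_history = []
    for _ in range(steps):
        grad = grad_fn(theta)
        # d loss(theta_t) / d lr = -(grad_t . grad_{t-1}); a descent step on
        # lr therefore adds hyper_lr * (grad_t . grad_{t-1}).
        lr = lr + hyper_lr * float(grad @ prev_grad)
        theta = theta - lr * grad
        prev_grad = grad
        lr_history.append(lr)
    return theta, lr_history


def quadratic_grad(theta):
    # Gradient of the toy objective 0.5 * ||theta||^2.
    return theta


theta_init = np.array([1.0, -2.0])
theta_final, lrs = hypergradient_sgd(quadratic_grad, theta_init)
```

In this toy run consecutive gradients point in the same direction, so the learning rate grows from its initial value while the parameters shrink toward the minimizer at the origin. No trial-and-error sweep over learning rates is involved, which is the property the abstract describes as one-shot and online.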
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Data Stream Mining Techniques
