Preconditioned Federated Learning

Zeyi Tao; Jindi Wu; Qun Li

arXiv:2309.11378 · cs.LG · September 21, 2023 · 1 citation

TL;DR

This paper introduces adaptive preconditioned algorithms for federated learning that improve communication efficiency and convergence, demonstrating state-of-the-art results in diverse data settings.

Contribution

It proposes novel adaptive federated learning algorithms using covariance matrix preconditioning, with theoretical guarantees and superior empirical performance.

Findings

01. Achieves state-of-the-art performance on i.i.d. data

02. Effective in non-i.i.d. data settings

03. Provides convergence guarantees for the proposed algorithms

Abstract

Federated Learning (FL) is a distributed machine learning approach that enables model training in a communication-efficient and privacy-preserving manner. The standard optimization method in FL is Federated Averaging (FedAvg), which performs multiple local SGD steps between communication rounds. FedAvg is considered to lack the adaptivity of modern first-order adaptive optimizers. In this paper, we propose new communication-efficient FL algorithms based on two adaptive frameworks: local adaptivity (PreFed) and server-side adaptivity (PreFedOp). The proposed methods achieve adaptivity by using a novel covariance matrix preconditioner. Theoretically, we provide convergence guarantees for our algorithms. Empirical experiments show that our methods achieve state-of-the-art performance in both i.i.d. and non-i.i.d. settings.
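
For readers unfamiliar with the FedAvg baseline mentioned in the abstract, the following is a minimal sketch of it on a toy, noise-free least-squares problem. It illustrates only plain FedAvg (several local SGD steps per client, then a size-weighted average on the server). It is not PreFed or PreFedOp, and it does not implement the paper's covariance matrix preconditioner, whose form is not given on this page. All function names, hyperparameters, and the synthetic data are assumptions made for illustration.

```python
import random

def local_sgd(w, data, lr, steps, rng):
    """Run `steps` SGD steps on one client's least-squares data, starting from w."""
    w = list(w)  # copy so the global model is not mutated
    for _ in range(steps):
        x, y = rng.choice(data)  # sample one (features, target) pair
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        # Gradient of 0.5 * err**2 with respect to w is err * x.
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def fedavg(clients, dim, rounds=100, local_steps=5, lr=0.2, seed=0):
    """Plain FedAvg: local SGD on every client, then a size-weighted average."""
    rng = random.Random(seed)
    w_global = [0.0] * dim
    total = sum(len(d) for d in clients)
    for _ in range(rounds):  # one iteration = one communication round
        local_models = [local_sgd(w_global, d, lr, local_steps, rng) for d in clients]
        # Server step: average local models, weighted by client dataset size.
        w_global = [
            sum(len(d) * w[j] for d, w in zip(clients, local_models)) / total
            for j in range(dim)
        ]
    return w_global

def make_clients(true_w, n_clients=4, n_points=20, seed=1):
    """Hypothetical synthetic data: each client holds noise-free linear samples."""
    rng = random.Random(seed)
    clients = []
    for _ in range(n_clients):
        data = []
        for _ in range(n_points):
            x = [rng.uniform(-1, 1) for _ in true_w]
            y = sum(wi * xi for wi, xi in zip(true_w, x))
            data.append((x, y))
        clients.append(data)
    return clients

TRUE_W = [2.0, -1.0]
clients = make_clients(TRUE_W)
w = fedavg(clients, dim=2)
print(w)  # should be close to TRUE_W
```

In this sketch, communication happens once per round, after `local_steps` local updates. That ratio is the trade-off the abstract refers to as communication efficiency. An adaptive variant would modify either the local update (local adaptivity) or the server averaging step (server-side adaptivity).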

Taxonomy

Topics: Machine Learning and ELM · Stochastic Gradient Optimization Techniques · Traffic Prediction and Management Techniques

Methods: Local SGD · Stochastic Gradient Descent