Understanding and Improving Model Averaging in Federated Learning on Heterogeneous Data

Tailin Zhou; Zehong Lin; Jun Zhang; Danny H.K. Tsang

arXiv:2305.07845·cs.LG·June 3, 2024·1 cites

Open Access · 1 Repo

TL;DR

This paper investigates why model averaging in federated learning (FL) works well despite data heterogeneity. It visualizes the loss landscape of client and global models, decomposes the global model's expected loss into factors related to the client models, and proposes iterative moving averaging (IMA) to improve performance.

Contribution

The paper offers new insight into the geometric and loss-related factors that influence model averaging in FL, and introduces IMA to improve accuracy and training speed.

Findings

01. IMA improves FL accuracy on heterogeneous data.

02. Loss-landscape visualization shows that the client models surround the global model within a common basin.

03. Loss decomposition highlights the impact of data heterogeneity on the global model.

Abstract

Model averaging is a widely adopted technique in federated learning (FL) that aggregates multiple client models to obtain a global model. Remarkably, model averaging in FL yields a superior global model, even when client models are trained with non-convex objective functions and on heterogeneous local datasets. However, the rationale behind its success remains poorly understood. To shed light on this issue, we first visualize the loss landscape of FL over client and global models to illustrate their geometric properties. The visualization shows that the client models encompass the global model within a common basin, and interestingly, the global model may deviate from the basin's center while still outperforming the client models. To gain further insights into model averaging in FL, we decompose the expected loss of the global model into five factors related to the client models.…
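The aggregation step described in the abstract can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the names `average_models` and `MovingAverage`, the dictionary-of-lists parameter format, and the fixed window of recent global models are all assumptions made for this example. The abstract does not specify how IMA is computed, so the moving average below is only a generic stand-in for the idea of averaging global models across rounds.

```python
from collections import deque
from typing import Deque, Dict, List, Sequence

# Hypothetical parameter format: layer name -> flat list of weights.
Params = Dict[str, List[float]]


def average_models(models: Sequence[Params], weights: Sequence[float]) -> Params:
    """Weighted average of models that share the same parameter names and shapes."""
    if not models or len(models) != len(weights):
        raise ValueError("need one weight per model and at least one model")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    averaged: Params = {}
    for name in models[0]:
        size = len(models[0][name])
        averaged[name] = [
            sum(w * m[name][i] for m, w in zip(models, weights)) / total
            for i in range(size)
        ]
    return averaged


class MovingAverage:
    """Uniform average over the most recent `window` global models (illustrative only)."""

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self.history: Deque[Params] = deque(maxlen=window)

    def update(self, global_model: Params) -> Params:
        # Store the newest global model; the deque drops the oldest when full.
        self.history.append(global_model)
        return average_models(list(self.history), [1.0] * len(self.history))


if __name__ == "__main__":
    # Two clients; weights stand in for local dataset sizes (an assumption).
    clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
    global_model = average_models(clients, weights=[1, 3])
    print(global_model)

    smoother = MovingAverage(window=2)
    for round_model in ({"w": [0.0]}, {"w": [2.0]}, {"w": [4.0]}):
        print(smoother.update(round_model))
```

Weighting each client by its local dataset size is a common choice in FL, but the weights here are placeholders. In practice the same averaging would be applied to framework tensors rather than Python lists.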

Code & Models

Repositories

tailinzhou/fedima (PyTorch, official)

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings