Understanding and Improving Model Averaging in Federated Learning on Heterogeneous Data
Tailin Zhou, Zehong Lin, Jun Zhang, Danny H.K. Tsang

TL;DR
This paper investigates why model averaging in federated learning (FL) works well despite data heterogeneity. It visualizes the loss landscape of client and global models, decomposes the global model's expected loss into factors tied to the client models, and proposes iterative moving averaging (IMA) to improve performance.
Contribution
It offers new insight into the geometric and loss-related factors that influence model averaging in FL, and introduces IMA to improve accuracy and training speed.
Findings
IMA improves FL accuracy on heterogeneous data
Loss-landscape visualization shows the global model lying within a basin shared with the client models
Loss decomposition highlights the impact of data heterogeneity
Abstract
Model averaging is a widely adopted technique in federated learning (FL) that aggregates multiple client models to obtain a global model. Remarkably, model averaging in FL yields a superior global model, even when client models are trained with non-convex objective functions and on heterogeneous local datasets. However, the rationale behind its success remains poorly understood. To shed light on this issue, we first visualize the loss landscape of FL over client and global models to illustrate their geometric properties. The visualization shows that the client models encompass the global model within a common basin, and interestingly, the global model may deviate from the basin's center while still outperforming the client models. To gain further insights into model averaging in FL, we decompose the expected loss of the global model into five factors related to the client models.…
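The aggregation step described in the abstract can be sketched in a few lines. This is a minimal illustration of weighted model averaging, not the paper's code: the function name `average_models`, the dict-of-arrays parameter layout, and weighting by local sample count are all assumptions made for this example.

```python
import numpy as np

def average_models(client_params, client_sizes):
    """Weighted average of client parameters (FedAvg-style aggregation).

    client_params: list of dicts mapping a layer name to an np.ndarray
    client_sizes:  local sample count per client, used as averaging weights
                   (an assumption; other weightings are possible)
    """
    weights = np.asarray(client_sizes, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return {
        name: sum(w * params[name] for w, params in zip(weights, client_params))
        for name in client_params[0]
    }

# Two hypothetical clients holding the same one-layer model.
clients = [
    {"w": np.array([1.0, 2.0]), "b": np.array([0.0])},
    {"w": np.array([3.0, 4.0]), "b": np.array([1.0])},
]
global_model = average_models(clients, client_sizes=[100, 300])
print(global_model)
```

The averaging is done per parameter tensor, so every client must share the same architecture and layer names. With heterogeneous data the client parameters can differ substantially, which is the setting the paper examines.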
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
