Widening the Network Mitigates the Impact of Data Heterogeneity on FedAvg
Like Jian, Dong Liu

TL;DR
This paper shows that widening the neural network in federated learning reduces the impact of data heterogeneity on FedAvg, so that its convergence and generalization performance approach those of centralized training. The claim is supported by theoretical analysis and experiments.
Contribution
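The paper provides a theoretical analysis showing that wider networks mitigate the impact of data heterogeneity in FedAvg. It proves that, in the infinite-width limit, both the global and local models behave as linear models and FedAvg achieves the same generalization performance as centralized learning with the same number of gradient descent iterations.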
Findings
Impact of data heterogeneity diminishes with network width
Infinite-width networks behave as linear models
FedAvg matches centralized learning performance at large widths
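The "linear model" behaviour in the second finding is commonly formalized as a first-order Taylor expansion of the network around its initialization. The block below writes out that standard form as a reading aid; the notation (f for the network output, θ for the parameters, θ_0 for their initial value) is an assumption and is not taken from the paper.

```latex
f_{\mathrm{lin}}(x;\theta) \;=\; f(x;\theta_0) \;+\; \nabla_\theta f(x;\theta_0)^{\top}\,(\theta - \theta_0)
```

Under this form the output is linear in the parameters θ, although it can remain nonlinear in the input x.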
Abstract
Federated learning (FL) enables decentralized clients to train a model collaboratively without sharing local data. A key distinction between FL and centralized learning is that clients' data are not independent and identically distributed (non-IID), which poses significant challenges in training a global model that generalizes well across heterogeneous local data distributions. In this paper, we analyze the convergence of overparameterized FedAvg with gradient descent (GD). We prove that the impact of data heterogeneity diminishes as the width of neural networks increases, ultimately vanishing when the width approaches infinity. In the infinite-width regime, we further prove that both the global and local models in FedAvg behave as linear models, and that FedAvg achieves the same generalization performance as centralized learning with the same number of GD iterations. Extensive experiments…
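To make the comparison between FedAvg and centralized GD concrete, here is a minimal sketch on a toy linear regression problem. It is not the paper's experimental setup: the data generator, the function names (`grad`, `gd`, `fedavg`, `make_client`) and all hyperparameters are hypothetical choices for illustration. Clients draw inputs from shifted distributions to mimic heterogeneity, each round every client runs local full-batch GD from the global model, and the server averages the results weighted by client dataset size.

```python
import random

def grad(w, data):
    """Gradient of the mean squared error 1/(2n) * sum (w.x - y)^2 over data."""
    g = [0.0] * len(w)
    for x, y in data:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            g[j] += err * xj / len(data)
    return g

def gd(w, data, steps, lr):
    """Run `steps` full-batch gradient descent steps starting from w."""
    w = list(w)
    for _ in range(steps):
        g = grad(w, data)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

def fedavg(clients, rounds, local_steps, lr, dim):
    """FedAvg: local GD on every client, then a size-weighted server average."""
    total = sum(len(c) for c in clients)
    w_global = [0.0] * dim
    for _ in range(rounds):
        local = [gd(w_global, c, local_steps, lr) for c in clients]
        w_global = [
            sum(len(c) / total * w_k[j] for c, w_k in zip(clients, local))
            for j in range(dim)
        ]
    return w_global

def make_client(rng, n, x_shift):
    """Hypothetical heterogeneous client: inputs are centred at x_shift."""
    data = []
    for _ in range(n):
        x = [rng.gauss(x_shift, 1.0), 1.0]  # one feature plus a bias term
        y = 2.0 * x[0] - 1.0 + rng.gauss(0.0, 0.1)
        data.append((x, y))
    return data

rng = random.Random(0)
clients = [
    make_client(rng, 20, -2.0),
    make_client(rng, 30, 0.0),
    make_client(rng, 50, 2.0),
]
pooled = [pair for c in clients for pair in c]

# Same total number of GD iterations (200) in all three runs.
w_fed_1 = fedavg(clients, rounds=200, local_steps=1, lr=0.05, dim=2)
w_fed_5 = fedavg(clients, rounds=40, local_steps=5, lr=0.05, dim=2)
w_central = gd([0.0, 0.0], pooled, steps=200, lr=0.05)
```

With one local step per round, the size-weighted average of the client updates is algebraically the same as one centralized GD step on the pooled data, so `w_fed_1` and `w_central` agree up to floating-point error. With several local steps (`w_fed_5`), the iterates generally differ when client data are heterogeneous, which is the gap the paper analyzes for overparameterized networks.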
Taxonomy
Topics: Network Traffic and Congestion Control · Network Security and Intrusion Detection · Advanced Optical Network Technologies
