Rethinking Normalization Methods in Federated Learning
Zhixu Du, Jingwei Sun, Ang Li, Pin-Yu Chen, Jianyi Zhang, Hai "Helen" Li, Yiran Chen

TL;DR
This paper identifies external covariate shift as a key challenge in federated learning, explains why batch normalization fails, and demonstrates that layer normalization improves model convergence and accuracy in non-IID settings.
Contribution
It uncovers the external covariate shift problem in federated learning and advocates for layer normalization over batch normalization to enhance performance.
Findings
Layer normalization outperforms batch normalization in federated learning.
Models with layer normalization converge faster and achieve higher accuracy.
External covariate shift causes contribution obliteration in FL models.
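The difference between the two normalization methods named in the findings can be illustrated with a minimal NumPy sketch. It is not the paper's experimental code: the two clients, their activation distributions, and the helper names `batch_norm` and `layer_norm` are assumptions made for illustration. It only shows that batch-normalization statistics depend on which data a client holds, while layer normalization of a sample uses that sample's own features.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature with statistics taken across the batch (axis 0).
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps), mean.ravel(), var.ravel()

def layer_norm(x, eps=1e-5):
    # Normalize each sample with statistics taken across its own features (axis 1).
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
# Two hypothetical clients whose activations follow different distributions (non-IID).
client_a = rng.normal(loc=0.0, scale=1.0, size=(64, 8))
client_b = rng.normal(loc=5.0, scale=3.0, size=(64, 8))

# Batch-norm statistics differ from client to client.
bn_a, mean_a, _ = batch_norm(client_a)
_, mean_b, _ = batch_norm(client_b)
bn_mean_gap = float(np.abs(mean_a - mean_b).mean())

# The batch-norm output for the same sample changes when the batch composition changes.
bn_mixed, _, _ = batch_norm(np.vstack([client_a, client_b]))
bn_shift = float(np.abs(bn_a[0] - bn_mixed[0]).mean())

# The layer-norm output for a sample is the same regardless of the rest of the batch.
sample = client_a[:1]
ln_alone = layer_norm(sample)
ln_mixed = layer_norm(np.vstack([sample, client_b]))[:1]
ln_batch_independent = bool(np.allclose(ln_alone, ln_mixed))

print("gap between client batch-norm means:", bn_mean_gap)
print("batch-norm output shift for one sample:", bn_shift)
print("layer norm independent of batch:", ln_batch_independent)
```

In this toy setting the per-client batch statistics disagree, whereas layer normalization needs no batch-level statistics at all, so there is nothing client-specific to reconcile when models are aggregated.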
Abstract
Federated learning (FL) is a popular distributed learning framework that can reduce privacy risks by not explicitly sharing private data. In this work, we explicitly uncover the external covariate shift problem in FL, which is caused by the independent local training processes on different devices. We demonstrate that external covariate shift leads to the obliteration of some devices' contributions to the global model. Further, we show that normalization layers are indispensable in FL since their inherited properties can alleviate the problem of obliterating some devices' contributions. However, recent works have shown that batch normalization, which is one of the standard components in many deep neural networks, incurs an accuracy drop in the global model in FL. The essential reason for the failure of batch normalization in FL is poorly studied. We unveil that external covariate…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Machine Learning and ELM
Methods: Batch Normalization · Layer Normalization
