Experimenting with Normalization Layers in Federated Learning on non-IID scenarios
Bruno Casella, Roberto Esposito, Antonio Sciarappa, Carlo Cavazzoni, Marco Aldinucci

TL;DR
This paper evaluates different normalization layers and collaboration frequencies in federated learning on non-IID data, showing that Group and Layer Normalization outperform Batch Normalization and that less frequent aggregation improves convergence.
Contribution
It benchmarks five normalization layers and analyzes their impact on federated learning with non-IID data, providing guidance for optimizing FL performance.
Findings
Group and Layer Normalization outperform Batch Normalization in FL.
Less frequent model aggregation improves convergence speed and model quality.
Normalization choice significantly affects federated learning on non-IID datasets.
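The summary does not say why the layers behave differently, so the following is only an illustrative sketch of one commonly cited difference, not the paper's code. Batch Normalization computes its statistics across the whole mini-batch, so the output for one sample depends on the other samples a client happens to hold. Group and Layer Normalization compute statistics per sample. The function names, tensor shapes, and the group count of 2 are assumptions chosen for the demo, and the learnable scale and shift parameters are omitted.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # x has shape (N, C, H, W); statistics are shared across the batch,
    # computed per channel over the batch and spatial axes.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def group_norm(x, num_groups, eps=1e-5):
    # Statistics are computed per sample, over each group of channels.
    n, c, h, w = x.shape
    assert c % num_groups == 0, "channels must divide evenly into groups"
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    return ((g - mean) / np.sqrt(var + eps)).reshape(n, c, h, w)

def layer_norm(x, eps=1e-5):
    # Without affine parameters, normalizing over (C, H, W) per sample
    # is the same as group normalization with a single group.
    return group_norm(x, num_groups=1, eps=eps)

# Hypothetical demo: the same sample placed in two batches with
# different statistics, mimicking two clients with skewed local data.
rng = np.random.default_rng(0)
sample = rng.normal(size=(1, 4, 2, 2))
batch_a = np.concatenate([sample, rng.normal(0.0, 1.0, size=(3, 4, 2, 2))])
batch_b = np.concatenate([sample, rng.normal(5.0, 3.0, size=(3, 4, 2, 2))])

gn_same = np.allclose(group_norm(batch_a, 2)[0], group_norm(batch_b, 2)[0])
ln_same = np.allclose(layer_norm(batch_a)[0], layer_norm(batch_b)[0])
bn_same = np.allclose(batch_norm(batch_a)[0], batch_norm(batch_b)[0])
```

Here `gn_same` and `ln_same` check that the shared sample is normalized identically in both batches, while `bn_same` checks the same for Batch Normalization, whose output shifts with the rest of the batch.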
Abstract
Training Deep Learning (DL) models requires large, high-quality datasets, often assembled with data from different institutions. Federated Learning (FL) has been emerging as a method for privacy-preserving pooling of datasets: institutions train models locally and collaborate by iteratively aggregating those locally trained models into a global one. One critical performance challenge of FL is operating on datasets that are not independently and identically distributed (non-IID) among the federation participants. Even though this fragility cannot be eliminated, it can be mitigated by suitably optimizing two hyper-parameters: the normalization layer and the collaboration frequency. In this work, we benchmark five different normalization layers for training Neural Networks (NNs), two families of non-IID data skew, and two datasets. Results show that Batch Normalization, widely employed…
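As a rough illustration of "iteratively aggregating locally trained models" and of collaboration frequency as a hyper-parameter, here is a minimal federated-averaging sketch on a toy linear regression problem. It is not the paper's experimental setup, which trains neural networks. The names `local_train`, `fed_avg`, and `run_federation`, the learning rate, and the synthetic client data are all assumptions made for this example. The `local_epochs` argument plays the role of the collaboration frequency: a larger value means clients aggregate less often.

```python
import numpy as np

def local_train(weights, x, y, epochs, lr=0.05):
    # Plain gradient descent on mean squared error, starting from the
    # current global weights (hypothetical stand-in for local NN training).
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * x.T @ (x @ w - y) / len(y)
        w -= lr * grad
    return w

def fed_avg(client_weights, client_sizes):
    # Average of client models, weighted by local dataset size.
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (stacked * (sizes / sizes.sum())[:, None]).sum(axis=0)

def run_federation(clients, rounds, local_epochs, dim):
    # Each round: broadcast global weights, train locally, aggregate.
    global_w = np.zeros(dim)
    for _ in range(rounds):
        updates = [local_train(global_w, x, y, local_epochs) for x, y in clients]
        global_w = fed_avg(updates, [len(y) for _, y in clients])
    return global_w

# Two hypothetical clients whose features come from different
# distributions, a simple form of non-IID (covariate) skew.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
x1 = rng.normal(0.0, 1.0, size=(20, 2))
x2 = rng.normal(1.0, 1.0, size=(60, 2))
clients = [(x1, x1 @ true_w), (x2, x2 @ true_w)]

w_frequent = run_federation(clients, rounds=20, local_epochs=1, dim=2)
w_infrequent = run_federation(clients, rounds=4, local_epochs=5, dim=2)
```

Both schedules perform the same total number of local gradient steps and differ only in how often the clients synchronize. This toy problem is meant to show where the knob sits and says nothing about which setting is better in general.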
Taxonomy
Topics: Privacy-Preserving Technologies in Data
Methods: Layer Normalization · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Batch Normalization
