FedEMA-Distill: Exponential Moving Average Guided Knowledge Distillation for Robust Federated Learning
Hamza Reguieg, Mohamed El Kamili, Essaid Sabir

TL;DR
FedEMA-Distill is a server-side method that combines an exponential moving average (EMA) of the global model with ensemble knowledge distillation to improve robustness, communication efficiency, and support for heterogeneous client models in federated learning, especially under non-IID data and adversarial clients.
Contribution
The paper presents a novel federated learning approach that uses EMA and logits-based knowledge distillation, supporting heterogeneous models and improving robustness and communication efficiency.
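The two server-side ingredients named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the temperature, the EMA coefficient `beta`, and the choice of a plain mean over clients are all assumptions made for illustration, and the paper may aggregate logits or schedule the EMA differently.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def ensemble_soft_targets(client_logits, temperature=2.0):
    """Average temperature-softened client predictions on the proxy set.

    client_logits has shape (num_clients, num_proxy_samples, num_classes).
    A plain mean is an assumption; a robust aggregator could be swapped in.
    """
    probs = softmax(np.asarray(client_logits) / temperature, axis=-1)
    return probs.mean(axis=0)

def distillation_loss(student_logits, soft_targets, temperature=2.0):
    """Mean KL(soft_targets || student) over the proxy samples."""
    log_q = np.log(softmax(np.asarray(student_logits) / temperature) + 1e-12)
    log_p = np.log(soft_targets + 1e-12)
    return float(np.mean(np.sum(soft_targets * (log_p - log_q), axis=-1)))

def ema_update(ema_weights, new_weights, beta=0.9):
    """Blend the previous EMA weights with the freshly distilled weights."""
    return {k: beta * ema_weights[k] + (1.0 - beta) * new_weights[k]
            for k in ema_weights}
```

In a round, the server would compute `ensemble_soft_targets` from the uploaded logits, train the global model on the proxy set against those targets using `distillation_loss`, and then smooth the result with `ema_update`. Because only logits on a shared proxy set are exchanged, clients with different architectures can contribute to the same ensemble.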
Findings
Improves top-1 accuracy by up to 6 percentage points on benchmark datasets.
Reduces communication rounds by 30-35%.
Enhances robustness against Byzantine clients.
Abstract
Federated learning (FL) often degrades when clients hold heterogeneous non-Independent and Identically Distributed (non-IID) data and when some clients behave adversarially, leading to client drift, slow convergence, and high communication overhead. This paper proposes FedEMA-Distill, a server-side procedure that combines an exponential moving average (EMA) of the global model with ensemble knowledge distillation from client-uploaded prediction logits evaluated on a small public proxy dataset. Clients run standard local training, upload only compressed logits, and may use different model architectures, so no changes are required to client-side software while still supporting model heterogeneity across devices. Experiments on CIFAR-10, CIFAR-100, FEMNIST, and AG News under Dirichlet-0.1 label skew show that FedEMA-Distill improves top-1 accuracy by several percentage points (up to +5% on…
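The Dirichlet-0.1 label skew mentioned in the abstract is commonly simulated by drawing, for each class, a Dirichlet-distributed share per client and splitting that class's samples accordingly. The sketch below shows this common recipe; the excerpt does not specify the paper's exact partitioning procedure, so the function name `dirichlet_label_skew` and its parameters are hypothetical.

```python
import numpy as np

def dirichlet_label_skew(labels, num_clients, alpha=0.1, seed=0):
    """Split sample indices across clients with per-class Dirichlet shares.

    A small alpha (such as 0.1) concentrates each class on a few clients,
    producing strongly non-IID label distributions.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        # Shuffle the indices of class c, then cut them by Dirichlet shares.
        idx = rng.permutation(np.flatnonzero(labels == c))
        shares = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(shares)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices
```

For example, `dirichlet_label_skew(np.repeat(np.arange(10), 100), num_clients=5)` assigns each of the 1,000 sample indices to exactly one of five clients, with most clients seeing only a few of the ten classes.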
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Quality and Management · Domain Adaptation and Few-Shot Learning
