Shift Happens: Mixture of Experts based Continual Adaptation in Federated Learning

Rahul Atul Bhope; K.R. Jayaram; Praveen Venkateswaran; Nalini Venkatasubramanian

arXiv:2506.18789 · cs.LG · September 26, 2025

TL;DR

This paper presents ShiftEx, a mixture of experts framework for federated learning that adaptively handles distribution shifts in streaming data, significantly improving accuracy and adaptation speed in non-stationary environments.

Contribution

The paper introduces a shift-aware mixture of experts model for federated learning, combining a latent memory for expert reuse with a facility location-based optimization strategy so the system can adapt dynamically to distribution shifts.

Findings

01. Achieves 5.5-12.9% accuracy improvement over baselines.

02. Adapts 22-95% faster to distribution shifts.

03. Effectively handles covariate and label shifts in FL.

Abstract

Federated Learning (FL) enables collaborative model training across decentralized clients without sharing raw data, yet faces significant challenges in real-world settings where client data distributions evolve dynamically over time. This paper tackles the critical problem of covariate and label shifts in streaming FL environments, where non-stationary data distributions degrade model performance and necessitate a middleware layer that adapts FL to distributional shifts. We introduce ShiftEx, a shift-aware mixture of experts framework that dynamically creates and trains specialized global models in response to detected distribution shifts using Maximum Mean Discrepancy for covariate shifts. The framework employs a latent memory mechanism for expert reuse and implements facility location-based optimization to jointly minimize covariate mismatch, expert creation costs, and label…
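
The abstract names Maximum Mean Discrepancy (MMD) as the detector for covariate shift. As a rough illustration only, and not the authors' implementation, the sketch below computes a biased estimate of squared MMD with an RBF kernel between a reference batch and a new batch using NumPy. The function names, the kernel bandwidth `gamma`, and the decision `threshold` are all assumptions made for this example; the abstract does not specify the kernel or how a threshold is chosen.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """RBF kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    # Pairwise squared Euclidean distances via the expansion
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = (
        (a ** 2).sum(axis=1)[:, None]
        + (b ** 2).sum(axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def mmd_squared(x, y, gamma=0.1):
    """Biased (V-statistic) estimate of squared MMD between samples x and y.

    x has shape (n, d) and y has shape (m, d).
    gamma is an assumed bandwidth, not a value taken from the paper.
    """
    k_xx = rbf_kernel(x, x, gamma).mean()
    k_yy = rbf_kernel(y, y, gamma).mean()
    k_xy = rbf_kernel(x, y, gamma).mean()
    return k_xx + k_yy - 2.0 * k_xy


def covariate_shift_detected(reference, current, threshold=0.1, gamma=0.1):
    """Flag a shift when the MMD estimate exceeds a hypothetical threshold."""
    return bool(mmd_squared(reference, current, gamma) > threshold)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.normal(0.0, 1.0, size=(200, 5))
    same_dist = rng.normal(0.0, 1.0, size=(200, 5))
    shifted = rng.normal(3.0, 1.0, size=(200, 5))  # mean moved in every feature

    print("same distribution:", mmd_squared(reference, same_dist))
    print("shifted:", mmd_squared(reference, shifted))
    print("shift flagged:", covariate_shift_detected(reference, shifted))
```

In a streaming setting, a client could compare each incoming window of features (or of learned feature embeddings) against a stored reference window in this way. In practice the threshold would need calibration, for example with a permutation test, rather than the fixed value assumed here.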
