FedCoE: Bridging Generalization and Personalization via Federated Coordinated Dual-level MoEs
Penglin Dai, Fulian Li, Xincao Xu, Junhua Wang, Lixin Duan, Xiao Wu

TL;DR
FedCoE introduces a dual-level Mixture-of-Experts framework for federated learning that improves both global generalization and local personalization, addressing non-IID data and the cold-start problem.
Contribution
It proposes a federated Mixture-of-Experts model with a shared gating network and an adaptive cold-start mechanism, improving accuracy and personalization over existing methods.
Findings
Achieves 78.00% global accuracy and 89.32% personalized accuracy, outperforming baselines.
Delivers 77.27% accuracy in cold-start scenarios without local fine-tuning.
Effectively mitigates expert drift and gating inconsistency.
Abstract
Federated Learning (FL) has emerged as a promising paradigm for privacy-preserving distributed learning. However, existing FL methods face a fundamental challenge: traditional averaging-based approaches suffer from parameter divergence under non-IID conditions, while personalized FL methods overfit to local data and fail to generalize to new clients (the cold-start problem). Mixture-of-Experts naturally addresses this by routing heterogeneous data to specialized experts rather than forcing uniform aggregation. In this paper, we propose FedCoE, a Federated Coordinated dual-level Mixture-of-Experts framework that balances global generalization with local personalization. FedCoE maintains multiple independent global expert models on the server and employs a shared gating network to dynamically model client-expert correlations during aggregation, effectively mitigating expert drift…
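The abstract's idea of using gating outputs to model client-expert correlations during aggregation can be illustrated with a small sketch. This is not the authors' algorithm, which the truncated abstract does not specify. It assumes, purely for illustration, that each client reports one flat parameter vector and one gating score per expert, and that each global expert is the gating-weighted average of the client vectors. The names `softmax`, `aggregate_experts`, `client_params`, and `gate_scores` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_experts(client_params, gate_scores):
    """Gating-weighted aggregation (illustrative assumption, not FedCoE itself).

    client_params: one flat parameter vector (list of floats) per client.
    gate_scores:   one list of raw gating scores per client, one score per expert.
    Returns one aggregated parameter vector per expert.
    """
    num_experts = len(gate_scores[0])
    dim = len(client_params[0])
    # weights[c][e]: how strongly client c is associated with expert e.
    weights = [softmax(s) for s in gate_scores]

    experts = []
    for e in range(num_experts):
        # Total gating mass assigned to expert e across all clients.
        mass = sum(w[e] for w in weights)
        expert = [
            sum(w[e] * p[d] for w, p in zip(weights, client_params)) / mass
            for d in range(dim)
        ]
        experts.append(expert)
    return experts
```

With uniform gating scores every expert collapses to the plain client average, which is the FedAvg-style behavior the abstract contrasts against. With sharply peaked scores each expert tracks the clients routed to it, so heterogeneous clients are no longer forced into a single averaged model.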
