Multi-Domain Learning with Global Expert Mapping
Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Oscar Mendez, Dacheng Tao, and Xuelong Li

TL;DR
GEM introduces a global expert mapping framework for mixture-of-experts models, improving domain specialization and performance across diverse datasets without balancing constraints.
Contribution
It replaces learned routing with a linear programming-based global scheduler, enabling effective expert specialization and interpretability in multi-domain learning.
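To make the idea of a global scheduler concrete, here is a minimal sketch of a planner that assigns whole domains to experts once, up front, instead of routing each input through a learned gate. It is not the paper's formulation: the `affinity` scores, the per-expert `capacity` limits, and the function names are all hypothetical. The assignment problem below is a linear program whose constraint matrix is totally unimodular, so its optimum sits at an integral vertex; for a toy instance this small, the sketch simply enumerates those integral assignments rather than calling an LP solver.

```python
from itertools import product


def plan_domain_to_expert(affinity, capacity):
    """Pick one expert per domain to maximize total affinity.

    affinity[d][e] -- hypothetical score for serving domain d with expert e
    capacity[e]    -- hypothetical maximum number of domains expert e may own

    Enumerates the integral vertices of the assignment LP
        max  sum_{d,e} affinity[d][e] * x[d][e]
        s.t. sum_e x[d][e] == 1           (each domain gets one expert)
             sum_d x[d][e] <= capacity[e] (planner-side capacity)
    which is only practical for very small instances.
    """
    n_domains = len(affinity)
    n_experts = len(capacity)
    best_score, best_plan = float("-inf"), None
    for plan in product(range(n_experts), repeat=n_domains):
        # Skip assignments that overload any expert.
        if any(plan.count(e) > capacity[e] for e in range(n_experts)):
            continue
        score = sum(affinity[d][e] for d, e in enumerate(plan))
        if score > best_score:
            best_score, best_plan = score, plan
    return best_plan, best_score


def route(domain_id, plan):
    """Look up the expert for a domain in the precomputed plan (no learned gate)."""
    return plan[domain_id]


# Toy instance: 4 domains, 3 experts, integer affinities (all made up).
affinity = [
    [9, 1, 2],
    [8, 3, 1],
    [2, 7, 4],
    [1, 2, 6],
]
capacity = [1, 2, 1]

plan, score = plan_domain_to_expert(affinity, capacity)
print(plan, score)
```

In this toy instance, domains 0 and 1 both prefer expert 0, but the capacity limit forces the planner to trade them off globally, which a per-input greedy gate would not do. Because the mapping is an explicit table, it can also be inspected directly, which is one way to read the interpretability claim above.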
Findings
GEM achieves state-of-the-art results on the UODB benchmark.
Notable improvements on underrepresented datasets.
Resolves task interference in few-shot adaptation.
Abstract
Human perception generalizes well across different domains, but most vision models struggle beyond their training data. This gap motivates multi-dataset learning, where a single model is trained on diverse datasets to improve robustness under domain shifts. However, unified training remains challenging due to inconsistencies in data distributions and label semantics. Mixture-of-Experts (MoE) models provide a scalable solution by routing inputs to specialized subnetworks (experts). Yet, existing MoEs often fail to specialize effectively, as their load-balancing mechanisms enforce uniform input distribution across experts. This fairness conflicts with domain-aware routing, causing experts to learn redundant representations and reducing performance, especially on rare or out-of-distribution domains. We propose GEM (Global Expert Mapping), a planner-compiler framework that replaces the…
