Spectral Manifold Regularization for Stable and Modular Routing in Deep MoE Architectures
Ibrahim Delibasoglu

TL;DR
This paper introduces SR-MoE, a spectral regularization method for Mixture of Experts architectures that enhances modularity and stability, preventing expert collapse and improving lifelong learning capabilities.
Contribution
The paper proposes a novel spectral regularization approach for MoE architectures that enforces structural modularity and stability, addressing expert collapse and interference issues.
Findings
SR-MoE maintains structural integrity with increasing depth.
With traditional linear gating, accuracy drops by up to 4.72% as depth increases, which the authors attribute to expert entanglement; SR-MoE reduces this interference.
Spectral constraints enable localized expert updates and lifelong learning.
Abstract
Mixture of Experts (MoE) architectures enable efficient scaling of neural networks but suffer from expert collapse, where routing converges to a few dominant experts. This reduces model capacity and causes catastrophic interference during adaptation. We propose the Spectrally-Regularized Mixture of Experts (SR-MoE), which imposes geometric constraints on the routing manifold to enforce structural modularity. Our method uses dual regularization: spectral norm constraints bound the Lipschitz constant of the routing function, while stable rank penalties preserve high-dimensional feature diversity in expert selection. We evaluate SR-MoE across architectural scales and dataset complexities using modular one-shot adaptation tasks. Results show that traditional linear gating fails with increasing depth (accuracy drops by up to 4.72% due to expert entanglement), while SR-MoE maintains structural integrity…
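The two quantities named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the loss, so the hinge-style penalty, the function names, and the parameters max_sigma, min_stable_rank, lam_spec and lam_rank are all assumptions made for illustration. The sketch treats a linear gate as a weight matrix W (rows are experts, columns are input features), estimates the spectral norm (largest singular value, which is the Lipschitz constant of the linear map x -> Wx) by power iteration, and computes the stable rank as the squared Frobenius norm divided by the squared spectral norm.

```python
import math
import random


def matvec(W, v):
    """Compute W @ v for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rmatvec(W, u):
    """Compute W^T @ u."""
    return [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(len(W[0]))]


def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))


def spectral_norm(W, iters=200, seed=0):
    """Estimate the largest singular value of W by power iteration on W^T W."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in W[0]]
    n = l2_norm(v)
    v = [x / n for x in v]
    for _ in range(iters):
        u = matvec(W, v)
        if l2_norm(u) == 0.0:
            return 0.0  # v lies in the null space (e.g. W is all zeros)
        z = rmatvec(W, u)
        n = l2_norm(z)
        v = [x / n for x in z]
    return l2_norm(matvec(W, v))


def stable_rank(W):
    """Stable rank: ||W||_F^2 / ||W||_2^2, between 1 and rank(W) for nonzero W."""
    frob_sq = sum(x * x for row in W for x in row)
    sigma = spectral_norm(W)
    return frob_sq / (sigma * sigma)


def routing_penalty(W, max_sigma=1.0, min_stable_rank=2.0,
                    lam_spec=1.0, lam_rank=1.0):
    """Hypothetical dual penalty (an assumption, not the paper's loss):
    penalize a spectral norm above max_sigma and a stable rank below
    min_stable_rank."""
    spec_term = max(0.0, spectral_norm(W) - max_sigma) ** 2
    rank_term = max(0.0, min_stable_rank - stable_rank(W))
    return lam_spec * spec_term + lam_rank * rank_term


# Two toy gates for 2 experts over 3 input features.
diverse_gate = [[3.0, 0.0, 0.0],
                [0.0, 3.0, 0.0]]    # experts read different features
rank_one_gate = [[3.0, 0.0, 0.0],
                 [3.0, 0.0, 0.0]]   # both experts read the same feature

print(stable_rank(diverse_gate), stable_rank(rank_one_gate))
print(routing_penalty(diverse_gate, max_sigma=3.0),
      routing_penalty(rank_one_gate, max_sigma=3.0))
```

In this toy setup the first gate has two equal singular values, so its stable rank is 2 and the hypothetical penalty is (numerically) zero, while the rank-one gate has stable rank 1 and a larger spectral norm, so both terms are active. In an actual training loop these terms would be computed with a differentiable framework and added to the task loss; pure Python is used here only to keep the sketch dependency-free.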
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Mobile Crowdsensing and Crowdsourcing
