Expert Routing for Communication-Efficient MoE via Finite Expert Banks
Mohammad Reza Deylam Salehi, Ali Khalesi

TL;DR
This paper introduces a finite-bank approach to analyze and optimize resource-efficient sparse Mixture-of-Experts models by quantifying routing information and generalization using information-theoretic measures.
Contribution
It develops a practical framework using finite expert banks and information theory to analyze and improve expert routing efficiency in sparse MoE architectures.
Findings
Mutual information I(S;W) tracks the generalization gap.
The Xu-Raginsky bound is looser than empirical estimates.
The framework enables analysis of resource-aware MoE inference systems.
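For reference, the Xu-Raginsky bound mentioned above is usually stated as follows. This is the standard form of the result, not text recovered from the paper: here S is assumed to be a training sample of n i.i.d. points, W the hypothesis output by the learning algorithm (in this setting, the selected expert), and the loss is assumed to be σ-sub-Gaussian. The paper's own notation and constants may differ.

```latex
% Standard Xu-Raginsky generalization bound (assumed notation:
% S = training sample of size n, W = selected hypothesis,
% loss assumed sigma-sub-Gaussian under the data distribution).
\[
  \bigl|\,\mathbb{E}\bigl[L_{\mu}(W) - L_{S}(W)\bigr]\bigr|
  \;\le\;
  \sqrt{\frac{2\sigma^{2}}{n}\, I(S;W)}
\]
```

Because the right-hand side grows with I(S;W), a selection rule that depends more strongly on the data is allowed a larger generalization gap by the bound.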
Abstract
Resource-efficient machine learning increasingly uses sparse Mixture-of-Experts (MoE) architectures, where the gate acts as both a learning component and a routing interface controlling computation, communication, and accuracy. Motivated by finite-rate interpretations of MoE gating, we treat the gate as a stochastic channel and use information-theoretic measures to quantify the routing information available to the selected expert. To make the associated information quantities tractable beyond synthetic examples, we develop a finite-bank MNIST construction using pretrained CNN experts and a discrete, data-dependent selection rule. Since the selected model belongs to a finite candidate set, the algorithmic mutual information I(S;W) admits a closed-form discrete-entropy estimator from the empirical posterior. Sweeping a data-dependence parameter, we observe that …
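The discrete-entropy estimator described in the abstract can be illustrated with a minimal sketch. It assumes the standard plug-in decomposition I(S;W) = H(W) − H(W|S), where each row of the input is the selection distribution over a finite bank of K experts for one sampled training set. The function names and the row-per-dataset layout are hypothetical and chosen for illustration; the paper's actual estimator may be constructed differently.

```python
import numpy as np


def entropy(p, axis=-1):
    """Shannon entropy in nats, treating 0 * log(0) as 0."""
    p = np.asarray(p, dtype=float)
    safe = np.where(p > 0, p, 1.0)  # log(1) = 0, so zero entries contribute nothing
    return -np.sum(p * np.log(safe), axis=axis)


def mutual_information_finite_bank(posteriors):
    """Plug-in estimate of I(S;W) for a finite expert bank.

    posteriors: array of shape (num_datasets, K); row i is the (assumed)
    empirical selection distribution P(W | S = s_i) over the K experts.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    marginal = posteriors.mean(axis=0)            # estimate of P(W)
    conditional = entropy(posteriors).mean()      # estimate of H(W | S)
    return float(entropy(marginal) - conditional)  # H(W) - H(W | S)
```

With this layout, a data-independent rule (identical rows) gives an estimate of zero, while a deterministic rule that picks a different expert for each of two datasets gives log 2 nats, the maximum for a two-expert bank.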
