One Router to Route Them All: Homogeneous Expert Routing for Heterogeneous Graph Transformers
Georgiy Shakirov, Albert Arakelov

TL;DR
This paper introduces Homogeneous Expert Routing (HER), a novel MoE layer for Heterogeneous Graph Transformers that promotes type-agnostic expertise, leading to improved performance and interpretability in heterogeneous graph learning.
Contribution
The paper proposes HER, an MoE-based approach that eliminates the need for type-specific experts in HGNNs, enhancing cross-type knowledge transfer and model generalization.
Findings
HER outperforms standard HGT and a type-separated MoE baseline on link prediction across IMDB, ACM, and DBLP.
Experts in HER specialize by semantic patterns rather than node types.
Regularizing type dependence improves model interpretability and efficiency.
Abstract
A common practice in heterogeneous graph neural networks (HGNNs) is to condition parameters on node/edge types, assuming types reflect semantic roles. However, this can cause overreliance on surface-level labels and impede cross-type knowledge transfer. We explore integrating Mixture-of-Experts (MoE) into HGNNs--a direction underexplored despite MoE's success in homogeneous settings. Crucially, we question the need for type-specific experts. We propose Homogeneous Expert Routing (HER), an MoE layer for Heterogeneous Graph Transformers (HGT) that stochastically masks type embeddings during routing to encourage type-agnostic specialization. Evaluated on IMDB, ACM, and DBLP for link prediction, HER consistently outperforms standard HGT and a type-separated MoE baseline. Analysis on IMDB shows HER experts specialize by semantic patterns (e.g., movie genres) rather than node types,…
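The routing idea in the abstract, masking type embeddings at random so that the router cannot lean on node type alone, can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption for illustration: the function name `route`, adding the type embedding to the node features, the `mask_prob` parameter, and top-k gating with renormalization are hypothetical and not taken from the paper.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(node_feat, type_emb, router_weights, mask_prob, top_k, rng):
    """Pick top_k experts for one node; returns [(expert_index, gate_weight), ...].

    Assumption (not from the paper): the router input is node_feat + type_emb,
    and with probability mask_prob the type embedding is dropped, so the router
    must decide from the node features alone.
    """
    masked = rng.random() < mask_prob
    if masked:
        router_in = list(node_feat)
    else:
        router_in = [f + t for f, t in zip(node_feat, type_emb)]

    # One logit per expert: dot product of that expert's router row with the input.
    logits = [sum(w * x for w, x in zip(row, router_in)) for row in router_weights]
    probs = softmax(logits)

    # Keep the top_k experts and renormalize their gate weights to sum to 1.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # 3 hypothetical experts, 2-d input
    features = [0.2, 0.9]
    # With mask_prob=1.0 the type embedding never reaches the router,
    # so two nodes with equal features but different types route identically.
    print(route(features, [5.0, -5.0], weights, 1.0, 2, rng))
    print(route(features, [-5.0, 5.0], weights, 1.0, 2, rng))
```

With `mask_prob` between 0 and 1, the router sees type information only some of the time, which is one plausible way to read "stochastically masks type embeddings during routing"; the paper's actual mechanism may differ.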
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning in Healthcare · Domain Adaptation and Few-Shot Learning
