Scaling Machine Learning Interatomic Potentials with Mixtures of Experts
Yuzhi Liu, Duo Zhang, Anyang Peng, Weinan E, Linfeng Zhang, Han Wang

TL;DR
This paper develops Mixture-of-Experts (MoE) architectures for machine learning interatomic potentials (MLIPs). Sparse activation with shared experts and element-wise routing improve accuracy and interpretability, and the resulting element-wise MoE model reaches state-of-the-art accuracy on the OMol25, OMat24, and OC20M benchmarks.
Contribution
The paper develops and analyzes Mixture-of-Experts (MoE) and Mixture-of-Linear-Experts (MoLE) architectures for MLIPs, and shows that element-wise routing and expert specialization improve both accuracy and interpretability.
Findings
Sparse activation with shared experts improves performance.
When shared experts are present, nonlinear MoE outperforms MoLE.
Element-wise routing consistently outperforms configuration-level routing and improves interpretability.
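The findings above can be made concrete with a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the class name `ElementwiseMoE`, the lookup-table router, the top-k softmax gating, and the two-layer tanh experts are all hypothetical choices. It shows the three ingredients named in the findings: a router that depends only on the chemical element, sparse activation (only `k` experts run per atom), and a shared expert that is always active.

```python
import math
import random

def make_mlp(rng, d_in, d_hidden, d_out):
    """Random two-layer MLP parameters (illustrative, untrained)."""
    w1 = [[rng.gauss(0, 0.3) for _ in range(d_in)] for _ in range(d_hidden)]
    w2 = [[rng.gauss(0, 0.3) for _ in range(d_hidden)] for _ in range(d_out)]
    return w1, w2

def matvec(w, x):
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def mlp(params, x):
    """Nonlinear expert: linear -> tanh -> linear."""
    w1, w2 = params
    hidden = [math.tanh(h) for h in matvec(w1, x)]
    return matvec(w2, hidden)

def softmax(zs):
    m = max(zs)
    es = [math.exp(z - m) for z in zs]
    total = sum(es)
    return [e / total for e in es]

def top_k_gates(logits, k):
    """Keep the k largest logits, softmax over them, zero everywhere else."""
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in chosen])
    gates = [0.0] * len(logits)
    for i, p in zip(chosen, probs):
        gates[i] = p
    return gates

class ElementwiseMoE:
    """Hypothetical element-wise MoE layer with one shared expert."""

    def __init__(self, elements, n_experts, k, d, seed=0):
        rng = random.Random(seed)
        self.k = k
        # Assumption: router logits are a per-element lookup table, so every
        # atom of the same element is sent to the same experts.
        self.router = {el: [rng.gauss(0, 1) for _ in range(n_experts)]
                       for el in elements}
        self.experts = [make_mlp(rng, d, 2 * d, d) for _ in range(n_experts)]
        self.shared = make_mlp(rng, d, 2 * d, d)

    def gates(self, element):
        return top_k_gates(self.router[element], self.k)

    def forward(self, element, feature):
        # The shared expert always contributes.
        out = mlp(self.shared, feature)
        # Only experts with a nonzero gate are evaluated (sparse activation).
        for g, expert in zip(self.gates(element), self.experts):
            if g > 0.0:
                y = mlp(expert, feature)
                out = [o + g * yi for o, yi in zip(out, y)]
        return out

model = ElementwiseMoE(["H", "C", "O"], n_experts=4, k=2, d=3)
atoms = [("H", [0.1, 0.2, 0.3]), ("O", [0.5, -0.1, 0.0]), ("H", [0.9, 0.4, -0.2])]
outputs = [model.forward(el, x) for el, x in atoms]
```

Because the gates depend only on the element label in this sketch, inspecting `model.gates("H")` versus `model.gates("O")` shows directly which experts serve which element, which is one simple way routing of this kind can be read off and interpreted.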
Abstract
Machine Learning Interatomic Potentials (MLIPs) enable accurate large-scale atomistic simulations, yet improving their expressive capacity efficiently remains challenging. Here we systematically develop Mixture-of-Experts (MoE) and Mixture-of-Linear-Experts (MoLE) architectures for MLIPs and analyze the effects of routing strategies and expert designs. We show that sparse activation combined with shared experts yields substantial performance gains, and that nonlinear MoE formulations outperform MoLE when shared experts are present, underscoring the importance of nonlinear expert specialization. Furthermore, element-wise routing consistently surpasses configuration-level routing, while global MoE routing often leads to numerical instability. The resulting element-wise MoE model achieves state-of-the-art accuracy across the OMol25, OMat24, and OC20M benchmarks. Analysis of routing…
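The contrast between MoE and MoLE in the abstract can be illustrated with a short numerical check. The sketch below assumes that each MoLE expert is a single linear map with no activation and that the gate values are already given; the function names `mix_outputs` and `merge_weights` are hypothetical and the numbers are random, not taken from the paper. Under that assumption, blending the expert outputs is the same as applying one merged weight matrix.

```python
import random

def matvec(w, x):
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def mix_outputs(gates, weights, x):
    """Run every linear expert, then blend the outputs: sum_k g_k (W_k x)."""
    out = [0.0] * len(weights[0])
    for g, w in zip(gates, weights):
        out = [o + g * y for o, y in zip(out, matvec(w, x))]
    return out

def merge_weights(gates, weights):
    """Blend the weight matrices first: W_eff = sum_k g_k W_k."""
    rows, cols = len(weights[0]), len(weights[0][0])
    return [[sum(g * w[i][j] for g, w in zip(gates, weights))
             for j in range(cols)]
            for i in range(rows)]

rng = random.Random(0)
d_in, d_out, n_experts = 4, 3, 3
# Illustrative random expert weights, one (d_out x d_in) matrix per expert.
weights = [[[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
           for _ in range(n_experts)]
gates = [0.2, 0.5, 0.3]  # assumed gate values, e.g. from a router
x = [rng.gauss(0, 1) for _ in range(d_in)]

blended = mix_outputs(gates, weights, x)
merged = matvec(merge_weights(gates, weights), x)
max_diff = max(abs(a - b) for a, b in zip(blended, merged))
```

Here `max_diff` is zero up to floating-point rounding, so for fixed gates a mixture of linear experts acts as a single linear layer. Experts with a nonlinearity between their layers cannot be merged this way in general, which is consistent with the abstract's emphasis on nonlinear expert specialization, although this summary does not state that as the reason.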
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Block Copolymer Self-Assembly
