LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models

Nam V. Nguyen; Thong T. Doan; Luong Tran; Van Nguyen; Quang Pham

arXiv:2411.00918·cs.CL·February 11, 2026


TL;DR

LibMoE is a comprehensive, open-source framework for efficient, reproducible research on Mixture of Experts (MoE) architectures in large language models, with tools for analysis and benchmarking.

Contribution

The paper introduces a unified framework for MoE research that supports both pretraining and sparse-upcycling regimes and offers analytical tools, lowering the barrier to entry and standardizing evaluation.

Findings

01 Routing entropy reveals task specialization and expert diversity.

02 Lightweight initialization affects early expert load balancing.

03 Different training regimes exhibit distinct routing stability profiles.
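To make the load-balancing finding concrete, here is a minimal sketch of how expert load could be measured from top-k routing decisions. The function names `expert_load` and `load_imbalance`, and the toy assignments, are hypothetical illustrations and are not taken from LibMoE's API.

```python
from collections import Counter


def expert_load(assignments, num_experts):
    """Fraction of routed token slots that each expert receives.

    `assignments` is a list with one tuple of expert indices per token
    (a hypothetical representation of top-k routing decisions).
    """
    counts = Counter(e for token in assignments for e in token)
    total = sum(counts.values())
    return [counts.get(e, 0) / total for e in range(num_experts)]


def load_imbalance(load):
    """Largest expert share divided by the uniform share.

    A value of 1.0 means perfectly balanced routing; larger values
    mean one expert receives more than its fair share.
    """
    uniform = 1.0 / len(load)
    return max(load) / uniform


# Toy example: 4 tokens, top-2 routing over 4 experts.
assignments = [(0, 1), (0, 2), (0, 3), (1, 2)]
load = expert_load(assignments, num_experts=4)
imbalance = load_imbalance(load)
```

In this toy case expert 0 receives three of the eight routed slots, so its share is 1.5 times the uniform share. Tracking a statistic like this over early training steps is one plausible way to observe how initialization affects balance.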

Abstract

Mixture of experts (MoE) architectures have become a cornerstone for scaling up and are a key component in most large language models such as GPT-OSS, DeepSeek-V3, Llama-4, and Gemini-2.5. However, systematic research on MoE remains severely constrained by the prohibitive computational costs of training and evaluation, which put large-scale studies out of reach for most researchers. We introduce LibMoE, a unified framework for reproducible, efficient, and extensible MoE research that supports both pretraining and sparse-upcycling regimes. Beyond unified implementations, the framework provides transparent analytical tools for probing routing and expert dynamics. Leveraging this foundation, we conduct a comprehensive analysis along three dimensions: (i) routing dynamics, covering expert selection patterns, routing stability and optimality, and how routing entropy reveals task…
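The abstract refers to routing entropy as a probe of expert dynamics. As a rough illustration, the sketch below computes the Shannon entropy of a router's softmax distribution over experts for each token and averages it over a batch. This is a generic formulation under stated assumptions: the helper names and the logits are hypothetical, and the paper's exact definition may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of router logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def routing_entropy(probs):
    """Shannon entropy (in nats) of one token's routing distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mean_routing_entropy(batch_logits):
    """Average per-token routing entropy over a batch of router logits."""
    entropies = [routing_entropy(softmax(l)) for l in batch_logits]
    return sum(entropies) / len(entropies)


# Hypothetical logits for two tokens over 4 experts.
uniform_token = [0.0, 0.0, 0.0, 0.0]   # router is undecided
peaked_token = [10.0, 0.0, 0.0, 0.0]   # router strongly prefers expert 0

h_uniform = routing_entropy(softmax(uniform_token))
h_peaked = routing_entropy(softmax(peaked_token))
h_mean = mean_routing_entropy([uniform_token, peaked_token])
```

Entropy is maximal, log(num_experts), when the router spreads probability evenly, and approaches zero when it commits to a single expert. Comparing the average across tasks is one way low-entropy, specialized routing could be distinguished from diffuse routing.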

Code & Models

Repositories

Fsoft-AIC/LibMoE
PyTorch · Official

Models

🤗 Fsoft-AIC/Phi3.5-Siglip-MoE
model

Datasets

DavidNguyen/LLAVA-LibMoE
dataset · 766 downloads

Taxonomy

Topics: Expert finding and Q&A systems · Topic Modeling

Methods: Mixture of Experts