XPERT: Expert Knowledge Transfer for Effective Training of Language Models

Chang Liu; Boyu Shi; Xu Yang; Xin Geng

arXiv:2605.08842·cs.CL·May 12, 2026

TL;DR

XPERT is a framework that extracts and reuses expert knowledge from pre-trained MoE language models, speeding up convergence and improving performance on language understanding and dialogue tasks.

Contribution

The paper presents a novel method for identifying, refining, and reusing expert knowledge from MoE LLMs to improve training effectiveness.

Findings

01. Models with reused expert knowledge outperform baselines in language understanding and dialogue tasks.

02. Reusing expert knowledge leads to faster convergence during training.

03. Cross-domain experts encode generalizable knowledge beneficial for multiple tasks.

Abstract

Mixture-of-Experts (MoE) language models organize knowledge into explicitly routed expert modules, making expert-level representations traceable and analyzable. By analyzing expert activation patterns in MoE large language models (LLMs), we find that a subset of experts is consistently activated across diverse knowledge domains. These common experts encode cross-domain, generalizable knowledge that is closely related to model generalization, naturally raising the question of how such identifiable expert knowledge can be practically reused. Motivated by this observation, we propose XPERT, a framework that extracts, consolidates, and reuses expert knowledge from pre-trained MoE LLMs to support more effective training of language models across different model scales. XPERT identifies cross-domain experts via inference-only analysis, refines their representations through tensor…
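
The idea of finding experts that are "consistently activated across diverse knowledge domains" can be illustrated with a minimal sketch. It is not the paper's procedure: the selection rule (an expert counts as common if its activation rate meets a threshold in every domain), the `min_rate` value, the function names, and the toy routing traces are all assumptions made for illustration. The sketch only assumes that per-token expert assignments have been logged during inference for each domain.

```python
from collections import Counter


def activation_rates(routed_experts, num_experts):
    """Fraction of tokens in one domain that were routed to each expert.

    routed_experts: list of per-token lists of expert ids (top-k routing,
    ids assumed distinct within a token).
    """
    counts = Counter(e for token in routed_experts for e in token)
    total = len(routed_experts)
    return [counts[e] / total if total else 0.0 for e in range(num_experts)]


def common_experts(domain_routes, num_experts, min_rate=0.5):
    """Experts whose activation rate is at least min_rate in every domain.

    min_rate is a hypothetical threshold, not a value from the paper.
    """
    rates = {d: activation_rates(r, num_experts) for d, r in domain_routes.items()}
    return [
        e for e in range(num_experts)
        if all(rates[d][e] >= min_rate for d in rates)
    ]


# Toy routing traces (made up): 4 experts, top-2 routing, 3 domains.
domain_routes = {
    "math": [[0, 1], [0, 2], [0, 1]],
    "law": [[0, 3], [0, 3], [1, 0]],
    "code": [[0, 2], [2, 0], [0, 1]],
}

shared = common_experts(domain_routes, num_experts=4, min_rate=0.9)
print(shared)
```

In this toy data only expert 0 is selected by every token in all three domains, so it is the only one that passes the threshold. A real analysis would read routing decisions from the MoE model's router per layer; how the paper scores and selects experts is not specified in the text above.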
