Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs

Jun Bai; Minghao Tong; Yang Liu; Zixia Jia; Zilong Zheng

arXiv:2508.19594 · cs.CL · November 13, 2025


TL;DR

This paper investigates whether certain experts in mixture-of-experts language models specialize in context utilization, and proposes methods to identify and fine-tune those experts to improve grounding in context-dependent tasks.

Contribution

The paper introduces Router Lens, a method for identifying context-faithful experts, and proposes Context-faithful Expert Fine-Tuning (CEFT), a lightweight, selective fine-tuning approach that enhances context grounding in large language models.

Findings

1. CEFT matches or surpasses the performance of full fine-tuning.
2. Router Lens effectively identifies context-faithful experts.
3. Context-faithful experts progressively amplify attention to relevant contextual information.

Abstract

Context faithfulness is essential for reliable reasoning in context-dependent scenarios. However, large language models often struggle to ground their outputs in the provided context, resulting in irrelevant responses. Inspired by the emergent expert specialization observed in mixture-of-experts architectures, this work investigates whether certain experts exhibit specialization in context utilization, offering a potential pathway toward targeted optimization for improved context faithfulness. To explore this, we propose Router Lens, a method that accurately identifies context-faithful experts. Our analysis reveals that these experts progressively amplify attention to relevant contextual information, thereby enhancing context grounding. Building on this insight, we introduce Context-faithful Expert Fine-Tuning (CEFT), a lightweight optimization approach that selectively fine-tunes…
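
The identify-then-fine-tune idea described in the abstract can be illustrated with a small toy sketch. This is not the paper's Router Lens or CEFT procedure: the abstract above does not state how experts are scored, so the sketch assumes, purely for illustration, that experts are ranked by how often top-k routing selects them on context-dependent inputs. All function names and the toy router logits are hypothetical.

```python
from collections import Counter


def top_k_experts(router_logits, k):
    """Return the indices of the k largest router logits for one token."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    return order[:k]


def count_expert_activations(per_token_logits, k):
    """Count how often each expert is routed to across tokens in one layer."""
    counts = Counter()
    for logits in per_token_logits:
        counts.update(top_k_experts(logits, k))
    return counts


def select_experts(counts, num_selected):
    """Pick the most frequently activated experts.

    Assumption: activation frequency is a stand-in criterion here,
    not the scoring rule used by the paper.
    """
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [expert for expert, _ in ranked[:num_selected]]


def trainable_mask(num_experts, selected):
    """True for experts that would be fine-tuned; the rest stay frozen."""
    chosen = set(selected)
    return [e in chosen for e in range(num_experts)]


# Toy router logits: 4 tokens, 4 experts, top-2 routing (made-up values).
toy_logits = [
    [0.1, 2.0, 0.3, 1.5],
    [0.2, 1.8, 0.1, 1.9],
    [1.2, 0.4, 0.3, 2.2],
    [0.0, 2.5, 0.2, 0.1],
]
counts = count_expert_activations(toy_logits, k=2)
selected = select_experts(counts, num_selected=2)
mask = trainable_mask(num_experts=4, selected=selected)
print(selected, mask)
```

In a real model, a mask like this would decide which expert parameters receive gradient updates while everything else stays frozen, which is what makes a selective approach lightweight compared with full fine-tuning.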
