Routing Sensitivity Without Controllability: A Diagnostic Study of Fairness in MoE Language Models

Junhyeok Lee; Kyu Sung Choi

arXiv:2603.27141 · cs.CL · April 13, 2026

TL;DR

This paper introduces FARE (Fairness-Aware Routing Equilibrium), a diagnostic framework for probing the limits of routing-level fairness interventions in MoE language models. It finds that such interventions are structurally limited and can carry a substantial utility cost.

Contribution

The paper provides a systematic analysis showing that routing sensitivity alone cannot reliably control stereotypes in MoE models, pointing to architectural constraints on routing-level intervention.

Findings

01

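Routing-level preference shifts are unachievable (Mixtral, Qwen1.5, Qwen3), statistically non-robust (DeepSeekMoE), or accompanied by a substantial utility cost (OLMoE).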

02

Bias and knowledge are entangled within expert groups.

03

Even where log-likelihood preference shifts are robust, they do not transfer to decoded generation: all generation metrics show null results.

Abstract

Mixture-of-Experts (MoE) language models are universally sensitive to demographic content at the routing level, yet exploiting this sensitivity for fairness control is structurally limited. We introduce Fairness-Aware Routing Equilibrium (FARE), a diagnostic framework designed to probe the limits of routing-level stereotype intervention across diverse MoE architectures. FARE reveals that routing-level preference shifts are either unachievable (Mixtral, Qwen1.5, Qwen3), statistically non-robust (DeepSeekMoE), or accompanied by substantial utility cost (OLMoE: -4.4 pp on CrowS-Pairs at a cost of -6.3 pp on TQA). Critically, even where log-likelihood preference shifts are robust, they do not transfer to decoded generation: expanded evaluations on both non-null models yield null results across all generation metrics. Group-level expert masking reveals why: bias and core knowledge are deeply entangled within…
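
To make the idea of expert masking at the routing level concrete, here is a minimal sketch of top-k routing in which a chosen group of experts is excluded before selection. It is a generic illustration, not the FARE procedure or any listed model's router: the function name `top_k_route`, the example logits, and the choice to renormalize with a softmax over only the selected experts are all assumptions made for this example.

```python
import math

def top_k_route(logits, k, masked=frozenset()):
    """Pick the top-k experts from router logits, skipping masked experts.

    Hypothetical helper for illustration only. Returns a dict
    {expert_index: gate_weight}, with weights given by a softmax over
    the selected experts' logits (an assumed gating convention).
    """
    # Candidates are all experts that are not masked out.
    candidates = [i for i in range(len(logits)) if i not in masked]
    if len(candidates) < k:
        raise ValueError("not enough unmasked experts for top-k routing")
    # Sort by logit (descending); ties are broken by lower expert index.
    chosen = sorted(candidates, key=lambda i: (-logits[i], i))[:k]
    # Numerically stable softmax over the chosen logits.
    peak = max(logits[i] for i in chosen)
    exps = {i: math.exp(logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

# Made-up router logits for a single token over 4 experts.
router_logits = [2.0, 0.5, 1.5, -1.0]

baseline = top_k_route(router_logits, k=2)             # no masking
ablated = top_k_route(router_logits, k=2, masked={0})  # expert 0 masked

print(sorted(baseline))  # experts selected without masking
print(sorted(ablated))   # experts selected once expert 0 is removed
```

Masking an expert forces the token onto the next-best expert, so whatever the masked expert contributed, wanted or unwanted, is removed together. That is the sense in which a masking experiment can expose entanglement between bias and core knowledge.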
