Layer-wise MoE Routing Locality under Shared-Prefix Code Generation: Token-Identity Decomposition and Compile-Equivalent Fork Redundancy
Shun-ichiro Hayashi, Daichi Mukunoki, Tetsuya Hoshino, Takahiro Katagiri

TL;DR
This paper measures how much Mixture-of-Experts (MoE) expert routing overlaps across code candidates generated from a shared prefix, and reports layer-wise patterns and their implications for search efficiency.
Contribution
It provides a layer-wise analysis of expert routing similarity during code generation, challenging prior assumptions and highlighting opportunities for optimization.
Findings
High routing similarity at positions where both sequences generated the same token (Jaccard similarity up to 0.649, about 40x random)
Layer-wise crossing pattern in routing similarity, with dips in middle layers
Most successful codes are concentrated in top assembly groups, with minimal within-group differences
Abstract
In LLM-based code generation, multiple code candidates are often generated in parallel from the same prompt -- for example, in best-of-N sampling or multi-candidate code completion. These requests can share KV caches through a common prefix, yet the extent to which their Mixture-of-Experts (MoE) expert routing overlaps, and how this overlap varies across layers, remains insufficiently understood. We study Qwen3.5-35B-A3B-FP8 (256 routed experts, top-8) by performing tree-search-based branching generation from a shared prefix (851 completed codes, temperature 0.7) and analyzing the results with a compiler-output-based alignment (gcc -S -O0 assembly) that controls for token-identity confounds. Our findings are threefold: (1) At positions where both sequences generated the same token, Jaccard similarity reaches 0.649 (40x random), while even at positions with different tokens it remains…
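The Jaccard figure above can be made concrete with a small sketch. It computes the Jaccard similarity between two top-8 expert sets and the expected value for two independent, uniformly random top-8 selections out of 256 experts. The uniform-random baseline is an assumption made here for illustration; the abstract does not state how the authors define "random", and the function names are hypothetical.

```python
from math import comb

NUM_EXPERTS = 256  # routed experts per MoE layer (from the abstract)
TOP_K = 8          # experts selected per token (from the abstract)


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two expert-index collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def random_baseline(n=NUM_EXPERTS, k=TOP_K):
    """Expected Jaccard of two independent uniform k-subsets of n experts.

    Assumption: "random" means uniform sampling without replacement, so the
    overlap size follows a hypergeometric distribution.
    """
    total = comb(n, k)
    expected = 0.0
    for overlap in range(k + 1):
        # Probability that the two subsets share exactly `overlap` experts.
        p = comb(k, overlap) * comb(n - k, k - overlap) / total
        # With `overlap` shared experts the union has 2k - overlap members.
        expected += p * overlap / (2 * k - overlap)
    return expected


baseline = random_baseline()
print(f"random baseline: {baseline:.4f}")
print(f"0.649 / baseline: {0.649 / baseline:.1f}x")
```

Under this assumption the ratio comes out on the order of 40, which is consistent with the "40x random" figure quoted in the abstract.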
