ROMER: Expert Replacement and Router Calibration for Robust MoE LLMs on Analog Compute-in-Memory Systems

Wenyong Zhou; Yuannuo Feng; Yizhe Chen; Taiqiang Wu; Wendong Xu; Wenbo Qi; Zhengwu Liu; Wang Kang; Ngai Wong

arXiv:2605.11800·cs.LG·May 13, 2026

TL;DR

This paper introduces ROMER, a post-training calibration framework for MoE LLMs deployed on analog CIM hardware. It targets two noise-induced problems, expert load imbalance and suboptimal routing decisions, to recover model quality.

Contribution

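ROMER is a post-training calibration framework that compensates for hardware noise in MoE LLMs on analog CIM systems to improve their robustness. It builds on the first systematic investigation of these models under a noise model calibrated with real chip measurements.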

Findings

01

ROMER reduces perplexity by up to 59.8% under real-chip noise.

02

Hardware noise disrupts expert load balance and routing decisions.

03

ROMER improves model robustness across multiple MoE architectures.

Abstract

Large language models (LLMs) with mixture-of-experts (MoE) architectures achieve remarkable scalability by sparsely activating a subset of experts per token, yet their frequent expert switching creates memory bandwidth bottlenecks that compute-in-memory (CIM) architectures are well-suited to mitigate. However, analog CIM systems suffer from inherent hardware imperfections that perturb stored weights, and the negative impact of these imperfections on MoE-based LLMs in noisy CIM environments remains unexplored. In this work, we present the first systematic investigation of MoE-based LLMs under a noise model calibrated with real chip measurements, revealing that hardware noise critically disrupts expert load balance and renders clean-trained routing decisions consistently suboptimal. Based on these findings, we propose ROMER, a post-training calibration framework that (1) replaces underactivated experts with…
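To make the two effects named in the abstract concrete, the following toy sketch perturbs the weights of a small top-k router and measures how the expert load and the routing decisions shift. It is not the paper's method or noise model: the additive Gaussian noise scaled by the largest weight magnitude, the random router and tokens, and all names (`add_weight_noise`, `expert_load`, `routing_flip_rate`, `NUM_EXPERTS`, `sigma`) are assumptions made for illustration, and only the router is perturbed here, whereas the abstract describes noise on stored weights in general.

```python
import random


def make_matrix(rows, cols, rng, scale=1.0):
    """Random Gaussian matrix as a list of rows (toy stand-in for real weights)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def add_weight_noise(weights, sigma, rng):
    """Hypothetical noise model: additive Gaussian noise with standard
    deviation sigma * max|w|. The paper calibrates its noise model on
    real chip measurements, which this sketch does not reproduce."""
    w_max = max(abs(w) for row in weights for w in row)
    return [[w + rng.gauss(0.0, sigma * w_max) for w in row] for row in weights]


def top_k_experts(router, token, k):
    """Indices of the k experts with the largest router logits."""
    logits = [sum(w * x for w, x in zip(row, token)) for row in router]
    ranked = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)
    return frozenset(ranked[:k])


def expert_load(router, tokens, k):
    """Fraction of all routing slots assigned to each expert."""
    counts = [0] * len(router)
    for token in tokens:
        for expert in top_k_experts(router, token, k):
            counts[expert] += 1
    total = len(tokens) * k
    return [c / total for c in counts]


def routing_flip_rate(clean, noisy, tokens, k):
    """Fraction of tokens whose selected expert set changes under noise."""
    changed = sum(
        top_k_experts(clean, t, k) != top_k_experts(noisy, t, k) for t in tokens
    )
    return changed / len(tokens)


rng = random.Random(0)
NUM_EXPERTS, DIM, TOP_K = 8, 16, 2          # toy sizes, chosen arbitrarily
router = make_matrix(NUM_EXPERTS, DIM, rng)  # one weight row per expert
tokens = make_matrix(256, DIM, rng)          # 256 random token vectors
noisy_router = add_weight_noise(router, sigma=0.1, rng=rng)

clean_load = expert_load(router, tokens, TOP_K)
noisy_load = expert_load(noisy_router, tokens, TOP_K)
flip_rate = routing_flip_rate(router, noisy_router, tokens, TOP_K)

print("clean load:", [round(x, 3) for x in clean_load])
print("noisy load:", [round(x, 3) for x in noisy_load])
print("tokens with changed routing:", round(flip_rate, 3))
```

Comparing `clean_load` with `noisy_load` shows how the per-expert share of tokens drifts, and `flip_rate` counts how often the noisy router disagrees with the clean one. Raising `sigma` in this toy setup is a simple way to explore how sensitive top-k routing is to weight perturbation.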
