RepetitionCurse: Measuring and Understanding Router Imbalance in Mixture-of-Experts LLMs under DoS Stress

Ruixuan Huang; Qingyue Wang; Hantao Huang; Yudong Gao; Dong Chen; Shuai Wang; Wei Wang

arXiv:2512.23995 · cs.CR · January 1, 2026

TL;DR

This paper uncovers a vulnerability in Mixture-of-Experts large language models: adversarial prompts can cause severe load imbalance across experts, increasing latency and enabling denial of service. It introduces RepetitionCurse, a black-box attack that exploits this flaw.

Contribution

The paper identifies a universal routing flaw in MoE models and proposes RepetitionCurse, a black-box attack method that exploits this vulnerability across different models.

Findings

01. Adversarial prompts cause routing imbalance in MoE models.

02. RepetitionCurse increases inference latency by over 3x.

03. The vulnerability can be exploited in a model-agnostic manner.

Abstract

Mixture-of-Experts architectures have become the standard for scaling large language models due to their superior parameter efficiency. To accommodate the growing number of experts in practice, modern inference systems commonly adopt expert parallelism to distribute experts across devices. However, the absence of explicit load balancing constraints during inference allows adversarial inputs to trigger severe routing concentration. We demonstrate that out-of-distribution prompts can manipulate the routing strategy such that all tokens are consistently routed to the same set of top-$k$ experts, which creates computational bottlenecks on certain devices while forcing others to idle. This converts an efficiency mechanism into a denial-of-service attack vector, leading to violations of service-level agreements for time to first token. We propose RepetitionCurse, a low-cost black-box strategy…
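The routing concentration described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method or any real model's router: it assumes a toy deterministic router (`top_k_experts`, hypothetical) that maps each token id to a fixed set of top-k experts, and an expert-parallel layout in which 16 experts are split evenly across 4 devices. All names and constants are illustrative assumptions. It compares per-device load for a prompt of distinct tokens against a prompt that repeats a single token.

```python
import random

# Illustrative assumptions, not values from the paper.
NUM_EXPERTS = 16
NUM_DEVICES = 4
TOP_K = 2
EXPERTS_PER_DEVICE = NUM_EXPERTS // NUM_DEVICES


def top_k_experts(token_id, k=TOP_K):
    """Toy router: the token id seeds pseudo-random expert scores.

    The same token always receives the same top-k experts, which mimics
    a router that depends only on the token representation.
    """
    rng = random.Random(token_id)
    scores = [rng.random() for _ in range(NUM_EXPERTS)]
    ranked = sorted(range(NUM_EXPERTS), key=lambda e: scores[e], reverse=True)
    return ranked[:k]


def device_loads(token_ids):
    """Count token-to-expert assignments landing on each device."""
    loads = [0] * NUM_DEVICES
    for token_id in token_ids:
        for expert in top_k_experts(token_id):
            loads[expert // EXPERTS_PER_DEVICE] += 1
    return loads


def imbalance(loads):
    """Busiest device's load divided by the mean load (1.0 = balanced)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean


diverse_prompt = list(range(1024))   # 1024 distinct token ids
repeated_prompt = [7] * 1024         # one token id repeated 1024 times

if __name__ == "__main__":
    for name, prompt in [("diverse", diverse_prompt), ("repeated", repeated_prompt)]:
        loads = device_loads(prompt)
        print(name, loads, round(imbalance(loads), 2))
```

In this toy setting the repeated prompt sends every token to the same two experts, so at most two devices do all the work while the rest idle. If the layer's latency is assumed to be governed by the busiest device, the imbalance ratio gives a rough sense of the slowdown relative to a balanced batch; real systems will differ in how closely latency tracks this ratio.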

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Graph Neural Networks · Privacy-Preserving Technologies in Data