MoEcho: Exploiting Side-Channel Attacks to Compromise User Privacy in Mixture-of-Experts LLMs
Ruyi Ding, Tianhong Xu, Xinyi Shen, Aidong Adam Ding, Yunsi Fei

TL;DR
MoEcho reveals that side-channel attacks exploiting hardware traces can compromise user privacy in Mixture-of-Experts large language models, highlighting a critical security vulnerability in modern transformer architectures.
Contribution
This work introduces MoEcho, the first analysis and demonstration of runtime side-channel vulnerabilities in MoE architectures, and proposes four novel attack methods across different hardware platforms.
Findings
Successfully breached user privacy in MoE-based models
Developed four hardware-specific side-channel attacks
Highlighted urgent need for security safeguards in MoE models
Abstract
The transformer architecture has become a cornerstone of modern AI, fueling remarkable progress across applications in natural language processing, computer vision, and multimodal learning. As these models continue to scale explosively for performance, implementation efficiency remains a critical challenge. Mixture of Experts (MoE) architectures, which selectively activate specialized subnetworks (experts), offer a unique balance between model accuracy and computational cost. However, the adaptive routing in MoE architectures, where input tokens are dynamically directed to specialized experts based on their semantic meaning, inadvertently opens up a new attack surface for privacy breaches. These input-dependent activation patterns leave distinctive temporal and spatial traces in hardware execution, which adversaries could exploit to deduce sensitive user data. In this work, we propose…
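The input-dependent routing described in the abstract can be illustrated with a toy top-k gate. This is a minimal sketch under stated assumptions, not the MoEcho attack and not the router of any particular model: the gate matrix `GATE`, the 3-dimensional token vectors, and the `route` helper are all hypothetical names chosen for illustration. It only shows that different inputs activate different expert sets, which is the kind of pattern the abstract says can leave traces in hardware execution.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(token, gate_weights, k=2):
    """Return the sorted indices of the top-k experts for one token vector."""
    # One gate logit per expert: dot product of the expert's gate row and the token.
    logits = [sum(w * x for w, x in zip(row, token)) for row in gate_weights]
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return sorted(ranked[:k])

# Hypothetical gate: 4 experts, 3-dimensional token embeddings (illustrative values).
GATE = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
]

token_a = [2.0, 0.1, 0.0]  # hypothetical embedding of one input token
token_b = [0.0, 0.1, 2.0]  # hypothetical embedding of a different token

experts_a = route(token_a, GATE)
experts_b = route(token_b, GATE)
print(experts_a, experts_b)
```

In this toy setup the two tokens are sent to disjoint expert sets, so anyone able to observe which experts run would learn something about the input. Real routers operate on learned, high-dimensional representations, so the mapping from observed experts back to input content is far less direct than here.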
Taxonomy
Topics: Security and Verification in Computing · Adversarial Robustness in Machine Learning · Ferroelectric and Negative Capacitance Devices
