Seed Hijacking of LLM Sampling and Quantum Random Number Defense
Ziyang You, Xiaoke Yang, Zhanling Fan, Feng Guo, Xiaogen Zhou, Xuxing Lu

TL;DR
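This paper identifies a vulnerability in LLM sampling: an attacker who manipulates PRNG outputs can force the selection of chosen tokens without touching the model's logits. It proposes a QRNG-based defense that neutralizes the attack in the evaluated threat model with negligible median overhead.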
Contribution
It introduces SeedHijack, a backdoor attack on LLM sampling, and presents a practical QRNG-based defense to mitigate this security risk.
Findings
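99.6% exact token injection on GPT-2 (124M) in a 540-trial benchmark across 9 sampling configurations
100% success on four aligned models (1.5B-7B, covering RLHF, SFT, and reasoning distillation)
QRNG defense neutralizes the attack in the evaluated threat model with negligible median overhead (+0.6% latency, +7.7 MB memory)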
Abstract
Large language models (LLMs) rely on deterministic pseudorandom number generators (PRNGs) for autoregressive sampling, creating a critical supply-chain attack surface overlooked by existing defenses. We present SeedHijack, a backdoor attack that manipulates PRNG outputs to force attacker-specified token selection without altering model logits. In a 540-trial benchmark on GPT-2 (124M), the attack achieves 99.6% exact token injection across 9 sampling configurations; it reaches 100% success on four aligned models (1.5B-7B, RLHF/SFT/reasoning distillation) and bypasses all alignment methods tested in this work. We further propose a defense based on a hardware quantum random number generator (QRNG), which neutralizes the attack in our evaluated threat model with negligible median overhead (+0.6% latency, +7.7 MB memory). Our work identifies a critical sampling-layer vulnerability and…
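To make the mechanism concrete, here is a minimal Python sketch of why controlling the random draw is enough to control the sampled token. It is an illustration under stated assumptions, not the paper's implementation. It assumes plain inverse-CDF sampling over a softmax distribution, and all function names (`softmax`, `sample_token`, `hijacked_uniform`, `os_entropy_uniform`) are hypothetical. Real inference stacks may use different sampling algorithms, and the last function uses the operating system's entropy source only as a stand-in for a hardware QRNG.

```python
import math
import secrets


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(probs, u):
    """Inverse-CDF sampling: pick the first token whose cumulative
    probability exceeds the uniform draw u in [0, 1)."""
    cumulative = 0.0
    for index, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return index
    return len(probs) - 1  # guard against rounding at the top end


def hijacked_uniform(probs, target):
    """Illustrative attacker step (assumption): return a draw that lies
    inside the target token's CDF interval. The logits are never modified."""
    low = 0.0
    for p in probs[:target]:
        low += p
    high = low + probs[target]
    return (low + high) / 2.0  # midpoint of the target's interval


def os_entropy_uniform():
    """Stand-in for a non-seedable entropy source. This uses the OS
    CSPRNG via `secrets`; it is NOT a quantum source."""
    return secrets.randbits(53) / (1 << 53)
```

For example, with `probs = softmax([2.0, 1.0, 0.1, -3.0])`, calling `sample_token(probs, hijacked_uniform(probs, 3))` selects token 3 even though it is the least likely token, while `sample_token(probs, os_entropy_uniform())` draws from a source that has no seed for an attacker to fix. The sketch only works for tokens with nonzero probability after any filtering, so under top-k or top-p the attacker would have to target a token that survives the filter.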
