Stepwise Reasoning Error Disruption Attack of LLMs
Jingyu Peng, Maolin Wang, Xiangyu Zhao, Kai Zhang, Wanyu Wang, Pengyue Jia, Qidong Liu, Ruocheng Guo, Qi Liu

TL;DR
This paper introduces SEED, a novel attack method that subtly disrupts the reasoning process of large language models by injecting errors into prior steps, exposing vulnerabilities in their reasoning robustness.
Contribution
The paper presents SEED, an attack compatible with both zero-shot and few-shot settings that preserves the natural reasoning flow and is shown to be effective on four datasets across four models.
Findings
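SEED misleads LLMs into incorrect subsequent reasoning and incorrect final answers on four datasets across four models.
LLM reasoning is vulnerable to errors injected into prior reasoning steps.
SEED works in both zero-shot and few-shot settings without modifying the instruction.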
Abstract
Large language models (LLMs) have made remarkable strides in complex reasoning tasks, but their safety and robustness in reasoning processes remain underexplored. Existing attacks on LLM reasoning are constrained by specific settings or lack of imperceptibility, limiting their feasibility and generalizability. To address these challenges, we propose the Stepwise rEasoning Error Disruption (SEED) attack, which subtly injects errors into prior reasoning steps to mislead the model into producing incorrect subsequent reasoning and final answers. Unlike previous methods, SEED is compatible with zero-shot and few-shot settings, maintains the natural reasoning flow, and ensures covert execution without modifying the instruction. Extensive experiments on four datasets across four different models demonstrate SEED's effectiveness, revealing the vulnerabilities of LLMs to disruptions in reasoning…
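The abstract describes the core idea only at a high level: an error is placed in an earlier reasoning step while the instruction stays untouched, and the model is left to continue from there. The following is a minimal sketch of that idea for robustness testing, not the authors' implementation. The helper names `perturb_step` and `build_continuation_prompt`, the "shift the last integer by one" rule, and the prompt layout are all hypothetical assumptions chosen for illustration.

```python
import re
from typing import List


def perturb_step(step: str, offset: int = 1) -> str:
    """Shift the last integer in a reasoning step by `offset`.

    Hypothetical error-injection rule; the abstract does not specify
    how SEED actually generates its errors.
    """
    matches = list(re.finditer(r"\d+", step))
    if not matches:
        return step  # nothing numeric to perturb, leave the step as-is
    last = matches[-1]
    wrong = str(int(last.group()) + offset)
    return step[:last.start()] + wrong + step[last.end():]


def build_continuation_prompt(question: str, steps: List[str], inject_at: int) -> str:
    """Assemble a prompt whose prior steps contain one altered step.

    The question and instruction text are left unchanged; only the step
    at index `inject_at` is perturbed. The layout is an assumption.
    """
    tampered = [
        perturb_step(s) if i == inject_at else s
        for i, s in enumerate(steps)
    ]
    lines = [f"Question: {question}", "Let's think step by step."]
    lines += [f"Step {i + 1}: {s}" for i, s in enumerate(tampered)]
    return "\n".join(lines)


question = "A shop has 3 boxes with 4 pens each and gives away 2 pens. How many pens remain?"
steps = ["3 boxes times 4 pens is 12 pens."]
prompt = build_continuation_prompt(question, steps, inject_at=0)
print(prompt)
```

A robustness check would send `prompt` to the model under test and compare its final answer with the answer obtained from the unperturbed steps; that comparison is left out here because it depends on the model interface being used.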
Taxonomy
Topics: Access Control and Trust · Security and Verification in Computing · Cloud Data Security Solutions
Methods: Softmax · Attention Is All You Need
