Multi-Stream Perturbation Attack: Breaking Safety Alignment of Thinking LLMs Through Concurrent Task Interference

Fan Yang

arXiv:2603.10091·cs.CR·March 12, 2026

TL;DR

This paper introduces a multi-stream perturbation attack that interleaves multiple task streams within a single prompt to exploit vulnerabilities in thinking-mode LLMs, bypassing safety measures and causing the thinking process to collapse.

Contribution

It proposes a new attack method based on concurrent task interference, built on three perturbation strategies (multi-stream interleaving, inversion perturbation, and shape transformation), and reveals a new security risk in thinking LLMs.

Findings

01 Achieves high attack success rates on the JailbreakBench, AdvBench, and HarmBench datasets across multiple models.

02 Causes the thinking process to collapse and produces repetitive outputs.

03 Outperforms existing methods in bypassing safety mechanisms.

Abstract

The widespread adoption of thinking mode in large language models (LLMs) has significantly enhanced their ability to process complex tasks, while also introducing new security risks. When subjected to jailbreak attacks, the step-by-step reasoning process may cause models to generate more detailed harmful content. We observe that thinking mode exhibits unique vulnerabilities when processing multiple interleaved tasks. Based on this observation, we propose the multi-stream perturbation attack, which generates superimposed interference by interweaving multiple task streams within a single prompt. We design three perturbation strategies: multi-stream interleaving, inversion perturbation, and shape transformation, which disrupt the thinking process through concurrent task interleaving, character reversal, and format constraints, respectively. On JailbreakBench, AdvBench, and HarmBench datasets, our…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Advanced Malware Detection Techniques