Turn-Based Structural Triggers: Prompt-Free Backdoors in Multi-Turn LLMs

Yiyang Lu; Jinwen He; Yue Zhao; Kai Chen; Ruigang Liang

arXiv:2601.14340·cs.CR·January 22, 2026

Turn-Based Structural Triggers: Prompt-Free Backdoors in Multi-Turn LLMs

Yiyang Lu, Jinwen He, Yue Zhao, Kai Chen, Ruigang Liang

PDF

Open Access

TL;DR

This paper introduces a novel backdoor attack on multi-turn LLMs that exploits dialogue structure, specifically turn indices, achieving high success rates and highlighting the need for structure-aware defenses.

Contribution

The paper presents Turn-based Structural Trigger (TST), a new backdoor attack leveraging dialogue turn structure, which is effective across models and datasets with minimal utility loss.

Findings

01

TST achieves an average attack success rate of 99.52% across models.

02

TST remains effective under five defenses with an average ASR of 98.04%.

03

The attack generalizes well across different instruction datasets.

Abstract

Large Language Models (LLMs) are widely integrated into interactive systems such as dialogue agents and task-oriented assistants. This growing ecosystem also raises supply-chain risks, where adversaries can distribute poisoned models that degrade downstream reliability and user trust. Existing backdoor attacks and defenses are largely prompt-centric, focusing on user-visible triggers while overlooking structural signals in multi-turn conversations. We propose Turn-based Structural Trigger (TST), a backdoor attack that activates from dialogue structure, using the turn index as the trigger and remaining independent of user inputs. Across four widely used open-source LLM models, TST achieves an average attack success rate (ASR) of 99.52% with minimal utility degradation, and remains effective under five representative defenses with an average ASR of 98.04%. The attack also generalizes well…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdversarial Robustness in Machine Learning · Topic Modeling · Natural Language Processing Techniques