When Emotion Becomes Trigger: Emotion-style dynamic Backdoor Attack Parasitising Large Language Models

Ziyu Liu; Tao Li; Tianjie Ni; Xiaolong Lan; Wengang Ma; Tao Yang; Guohua Wang; Junjiang He

arXiv:2605.11612·cs.CL·May 13, 2026

TL;DR

This paper introduces Paraesthesia, a novel emotion-style dynamic backdoor attack on large language models that leverages emotional cues as triggers, achieving high success rates while remaining stealthy.

Contribution

The paper proposes a backdoor attack that uses emotional style as the trigger, an approach the authors present as more covert and more effective than static token-based triggers.

Findings

01 Achieves around 99% attack success rate across multiple models and tasks.

02 Maintains the utility of models on clean data.

03 Demonstrates emotional style as an effective backdoor trigger.

Abstract

Backdoor vulnerabilities are widespread in the fine-tuning of large language models (LLMs). Most backdoor poisoning methods operate mainly at the token level and lack deeper semantic manipulation, which limits stealthiness. In addition, prior attacks rely on a single fixed trigger to induce harmful outputs. Such static triggers are easy to detect, and clean fine-tuning can weaken the trigger-target association. Through causal validation, we observe that emotion is not directly linked to individual words, but functions as an overall stylistic factor through tone. In the representation space of an LLM, emotion can be decoupled from semantics, forming a distinct cluster from the original neutral text. Therefore, we use the emotional factor as the backdoor trigger and propose a parasitic emotion-style dynamic backdoor attack, Paraesthesia. By mixing samples with the emotional trigger into…
