Multi-Trigger Poisoning Amplifies Backdoor Vulnerabilities in LLMs

Sanhanat Sivapiromrat; Caiqi Zhang; Marco Basaldella; Nigel Collier

arXiv:2507.11112 · cs.CL · October 10, 2025

Open Access

TL;DR

This paper reveals that multiple backdoor triggers can coexist and persist in LLMs, increasing vulnerability, and proposes a targeted retraining method to effectively mitigate multi-trigger poisoning attacks.

Contribution

The paper introduces a framework for studying multi-trigger poisoning in LLMs and proposes a layer-wise retraining defense that removes embedded triggers efficiently.
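As a rough illustration of what a layer-wise retraining defense might look like, the sketch below ranks layers by how far their weights differ between two checkpoints and then applies a gradient step to the selected layers only, leaving the rest frozen. The selection criterion (L2 weight difference against a reference checkpoint), the function names `layer_drift`, `select_layers`, and `retrain_step`, and all numbers are assumptions made for illustration; this summary does not specify how the paper chooses or updates layers.

```python
import math

def layer_drift(reference, suspect):
    """L2 distance between the two checkpoints' weights, per named layer."""
    return {
        name: math.sqrt(sum((s - r) ** 2
                            for r, s in zip(reference[name], suspect[name])))
        for name in reference
    }

def select_layers(reference, suspect, k):
    """Names of the k layers whose weights differ most (assumed criterion)."""
    drift = layer_drift(reference, suspect)
    return sorted(drift, key=drift.get, reverse=True)[:k]

def retrain_step(weights, grads, trainable, lr=0.5):
    """One gradient step on layers in `trainable`; all other layers stay frozen."""
    return {
        name: [w - lr * g for w, g in zip(ws, grads[name])]
        if name in trainable else list(ws)
        for name, ws in weights.items()
    }

# Toy, made-up weights: three "layers" with two parameters each.
reference = {"layer0": [0.0, 0.0], "layer1": [1.0, 1.0], "layer2": [2.0, 2.0]}
suspect = {"layer0": [0.0, 0.1], "layer1": [4.0, 5.0], "layer2": [2.0, 2.0]}
grads = {"layer0": [1.0, 1.0], "layer1": [1.0, 1.0], "layer2": [1.0, 1.0]}

chosen = select_layers(reference, suspect, k=1)
updated = retrain_step(suspect, grads, set(chosen))
```

In a real model the weight lists would be parameter tensors and the gradients would come from fine-tuning on clean data; the point here is only that updates are restricted to a small, selected subset of layers.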

Findings

01 Multiple triggers can coexist without interference.

02 Triggers with high embedding similarity remain active despite token substitutions.

03 The proposed retraining method removes triggers effectively with minimal updates.

Abstract

Recent studies have shown that Large Language Models (LLMs) are vulnerable to data poisoning attacks, where malicious training examples embed hidden behaviours triggered by specific input patterns. However, most existing works assume a single trigger phrase and focus on the attack's effectiveness, offering limited understanding of trigger mechanisms and how multiple triggers interact within the model. In this paper, we present a framework for studying poisoning in LLMs. We show that multiple distinct backdoor triggers can coexist within a single model without interfering with each other, enabling adversaries to embed several triggers concurrently. Using multiple triggers with high embedding similarity, we demonstrate that poisoned triggers can achieve robust activation even when tokens are substituted or separated by long token spans. Our findings expose a broader and more persistent vulnerability…
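The notion of "high embedding similarity" between triggers can be made concrete with cosine similarity. The sketch below compares every pair of trigger embeddings; the trigger names and the 3-dimensional vectors are hypothetical placeholders, and cosine similarity is assumed here as the metric since the abstract does not name one.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pairwise_similarity(embeddings):
    """Cosine similarity for every unordered pair of named embeddings."""
    names = sorted(embeddings)
    return {
        (a, b): cosine_similarity(embeddings[a], embeddings[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Hypothetical embeddings; real ones would come from the model's
# embedding layer and have far more dimensions.
trigger_embeddings = {
    "trigger_a": [1.0, 0.0, 0.0],
    "trigger_b": [2.0, 0.0, 0.0],  # same direction as trigger_a
    "trigger_c": [0.0, 3.0, 0.0],  # orthogonal to both
}

similarities = pairwise_similarity(trigger_embeddings)
```

Pairs that score close to 1 point in nearly the same direction in embedding space, which is the sense in which two triggers would count as highly similar under this assumption.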

Taxonomy

Topics: Smart Grid Security and Resilience · Software System Performance and Reliability · Blockchain Technology Applications and Security

Methods: High-Order Consensuses · Focus