Removing the Trigger, Not the Backdoor: Alternative Triggers and Latent Backdoors

Gorka Abad; Ermes Franch; Stefanos Koffas; Stjepan Picek

arXiv:2603.09772 · cs.CV · March 11, 2026

TL;DR

This paper reveals that backdoors can be activated by alternative triggers distinct from the original, suggesting defenses should focus on feature-space directions rather than input triggers.

Contribution

The paper introduces the concept of alternative triggers, proves theoretically that they exist, and demonstrates their practical impact on backdoor defenses.

Findings

01. Alternative triggers reliably activate the same backdoor.

02. Removing training triggers often fails to eliminate backdoors.

03. Backdoors can be exploited via latent feature-space directions.

Abstract

Current backdoor defenses assume that neutralizing a known trigger removes the backdoor. We show this trigger-centric view is incomplete: alternative triggers, patterns perceptually distinct from training triggers, reliably activate the same backdoor. We estimate the alternative trigger backdoor direction in feature space by contrasting clean and triggered representations, and then develop a feature-guided attack that jointly optimizes target prediction and directional alignment. First, we theoretically prove that alternative triggers exist and are an inevitable consequence of backdoor training. Then, we verify this empirically. Additionally, defenses that remove training triggers often leave backdoors intact, and alternative triggers can exploit the latent backdoor feature space. Our findings motivate defenses targeting backdoor directions in representation space rather than…
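The abstract gives no formulas, so the following is only a minimal sketch of one way the two steps it names could look. It assumes the backdoor direction is estimated as the normalized difference between mean triggered and mean clean feature vectors, and that "directional alignment" is a cosine-similarity penalty added to a cross-entropy loss on the target class. The function names, the weight `lam`, and both formulas are illustrative assumptions, not the authors' method.

```python
import math

def mean_vector(rows):
    """Element-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def estimate_direction(clean_feats, triggered_feats):
    """Unit vector from the clean feature mean to the triggered feature mean.
    Assumption: a mean-difference estimator; the paper may use another one."""
    mu_clean = mean_vector(clean_feats)
    mu_trig = mean_vector(triggered_feats)
    diff = [t - c for t, c in zip(mu_trig, mu_clean)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("clean and triggered means coincide")
    return [d / norm for d in diff]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def cross_entropy(logits, target):
    """Negative log-softmax probability of the target class (stable form)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[target]

def joint_loss(logits, target, feat_shift, direction, lam=1.0):
    """Target-prediction term plus a directional-alignment penalty.
    feat_shift: features of the perturbed input minus features of the clean
    input (hypothetical definition). lam: hypothetical trade-off weight."""
    return cross_entropy(logits, target) + lam * (1.0 - cosine(feat_shift, direction))

# Toy usage with 2-D features.
direction = estimate_direction([[0, 0], [0, 2]], [[3, 1], [5, 1]])
aligned = joint_loss([0.0, 0.0], 0, [2.0, 0.0], direction)
orthogonal = joint_loss([0.0, 0.0], 0, [0.0, 1.0], direction)
```

In this toy setup a candidate trigger whose feature shift is aligned with the estimated direction incurs only the classification term, while an orthogonal shift pays the full alignment penalty. In a real attack the loss would be minimized over the input perturbation with a differentiable framework; that part is omitted here.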

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis