Scaling Exposes the Trigger: Input-Level Backdoor Detection in Text-to-Image Diffusion Models via Cross-Attention Scaling
Zida Li, Jun Li, Yuzhe Sha, Ziqiang Li, Lizhi Xiong, Zhangjie Fu

TL;DR
This paper introduces SET, a novel input-level backdoor detection method for text-to-image diffusion models that exploits differences in response patterns under cross-attention scaling, outperforming existing defenses especially against stealthy triggers.
Contribution
The work uncovers the Cross-Attention Scaling Response Divergence phenomenon and develops SET, a trigger-agnostic detection framework that learns a benign response space for robust backdoor detection.
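To make the idea of "scaling cross-attention and learning a benign response space" concrete, here is a minimal toy sketch. It is not the authors' implementation: the summary does not say where the scaling is applied or how the benign space is modeled. The sketch assumes the scaling multiplies the attention logits, that the "response" is the mean change in attention output relative to the unscaled pass, and that the benign space is a Gaussian scored by Mahalanobis distance. All function names (`cross_attention`, `response_features`, `fit_benign_space`, `anomaly_score`, `random_prompt`) are hypothetical.

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def cross_attention(q, k, v, scale=1.0):
    # Single-head cross-attention. ASSUMPTION: the probe scales the attention logits.
    logits = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scale * logits) @ v

def response_features(q, k, v, scales=(0.5, 2.0, 4.0)):
    # ASSUMPTION: the response is the mean L2 change of the output vs. the unscaled pass.
    base = cross_attention(q, k, v, 1.0)
    return np.array([
        np.linalg.norm(cross_attention(q, k, v, s) - base, axis=-1).mean()
        for s in scales
    ])

def fit_benign_space(features, ridge=1e-6):
    # ASSUMPTION: the benign response space is modeled as a single Gaussian.
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + ridge * np.eye(features.shape[1])
    return mean, np.linalg.inv(cov)

def anomaly_score(feature, mean, cov_inv):
    # Mahalanobis distance to the benign mean; larger means less benign-like.
    d = feature - mean
    return float(np.sqrt(max(d @ cov_inv @ d, 0.0)))

def random_prompt(rng, n_img=16, n_txt=5, dim=8):
    # Stand-in for real image queries and text keys/values (random toy data).
    return (rng.normal(size=(n_img, dim)),
            rng.normal(size=(n_txt, dim)),
            rng.normal(size=(n_txt, dim)))

rng = np.random.default_rng(0)
benign_feats = np.stack([response_features(*random_prompt(rng)) for _ in range(50)])
mean, cov_inv = fit_benign_space(benign_feats)
print(anomaly_score(response_features(*random_prompt(rng)), mean, cov_inv))
```

In a real pipeline the features would come from the diffusion model's cross-attention layers rather than random tensors, and an input would be flagged when its score exceeds a threshold chosen on held-out benign prompts.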
Findings
SET outperforms existing methods across diverse attack scenarios.
SET achieves 9.1% higher AUROC and 6.5% higher accuracy (ACC) than the best baseline.
Effective against stealthy, implicit-trigger backdoor attacks.
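For readers unfamiliar with the two metrics reported above, the following short sketch shows how AUROC and ACC are typically computed from per-input detection scores. The labels, scores, and the 0.5 threshold are made-up illustration values, not data from the paper, and the helper names `auroc` and `accuracy` are hypothetical.

```python
import numpy as np

def auroc(labels, scores):
    # Probability that a random backdoor input (label 1) scores higher than
    # a random benign input (label 0); ties count as one half.
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).astype(float)
    ties = (pos[:, None] == neg[None, :]).astype(float)
    return float((greater + 0.5 * ties).mean())

def accuracy(labels, scores, threshold):
    # Fraction of inputs classified correctly when flagging scores >= threshold.
    labels = np.asarray(labels)
    preds = (np.asarray(scores, dtype=float) >= threshold).astype(int)
    return float((preds == labels).mean())

labels = [0, 0, 1, 1]            # 0 = benign, 1 = backdoor (toy values)
scores = [0.1, 0.4, 0.35, 0.8]   # detector anomaly scores (toy values)
print(auroc(labels, scores))          # 0.75
print(accuracy(labels, scores, 0.5))  # 0.75
```

AUROC is threshold-free and summarizes how well the scores rank backdoor inputs above benign ones, while ACC depends on the chosen threshold.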
Abstract
Text-to-image (T2I) diffusion models have achieved remarkable success in image synthesis, but their reliance on large-scale data and open ecosystems introduces serious backdoor security risks. Existing defenses, particularly input-level methods, are more practical for deployment but often rely on observable anomalies that become unreliable under stealthy, semantics-preserving trigger designs. As modern backdoor attacks increasingly embed triggers into natural inputs, these methods degrade substantially, raising a critical question: can more stable, implicit, and trigger-agnostic differences between benign and backdoor inputs be exploited for detection? In this work, we address this challenge from an active probing perspective. We introduce controlled scaling perturbations on cross-attention and uncover a novel phenomenon termed Cross-Attention Scaling Response Divergence (CSRD), where…
