BadBlocks: Lightweight and Stealthy Backdoor Threat in Text-to-Image Diffusion Models

Yu Pan; Jiahao Chen; Wenjie Wang; Bingrong Dai; Junjun Yang

arXiv:2508.03221·cs.CR·January 1, 2026

TL;DR

BadBlocks is a lightweight, stealthy backdoor attack on text-to-image diffusion models. It costs less to mount than prior attacks, keeps a high attack success rate, and evades current defenses, which lowers the barrier to malicious manipulation.

Contribution

The paper presents BadBlocks, a backdoor technique that contaminates only specific UNet blocks. Restricting the attack to those blocks keeps the computation low and makes the backdoor harder to detect.

Findings

01. Achieves high attack success rates with minimal perceptual degradation.

02. Requires only about 30% of the computation and 20% of the GPU time of prior methods.

03. Effectively evades state-of-the-art defenses, especially attention-based detection frameworks.

Abstract

Diffusion models have recently achieved remarkable success in image generation, yet growing evidence shows their vulnerability to backdoor attacks, where adversaries implant covert triggers to manipulate outputs. While existing defenses can detect many such attacks via visual inspection and neural network-based analysis, we identify a more lightweight and stealthy threat, termed BadBlocks. BadBlocks selectively contaminates specific blocks within the UNet architecture while preserving the normal behavior of the remaining components. Compared with prior methods, it requires only about 30% of the computation and 20% of the GPU time, yet achieves high attack success rates with minimal perceptual degradation. Extensive experiments demonstrate that BadBlocks can effectively evade state-of-the-art defenses, particularly attention-based detection frameworks. Ablation studies further reveal…
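The abstract's central idea, updating only selected UNet blocks while the rest of the network stays untouched, is an instance of selective fine-tuning. The sketch below illustrates only that selection step in plain Python, under stated assumptions: the parameter names, the sizes in `PARAM_SIZES`, and the helpers `select_trainable` and `trainable_fraction` are all hypothetical and are not taken from the paper. This summary does not say how BadBlocks chooses its blocks or how it trains them, so none of that is shown.

```python
from typing import Dict, Iterable, Set


def select_trainable(param_sizes: Dict[str, int],
                     target_blocks: Iterable[str]) -> Set[str]:
    """Return the names of parameters that live inside one of the target blocks.

    A parameter belongs to a block when its dotted name starts with
    "<block>." (the trailing dot keeps "up_blocks.1" from matching
    "up_blocks.10").
    """
    prefixes = tuple(f"{block}." for block in target_blocks)
    return {name for name in param_sizes if name.startswith(prefixes)}


def trainable_fraction(param_sizes: Dict[str, int],
                       trainable: Set[str]) -> float:
    """Fraction of all parameter elements that would receive updates."""
    total = sum(param_sizes.values())
    return sum(param_sizes[name] for name in trainable) / total


# Hypothetical inventory: parameter name -> element count.
# These names and sizes are made up for illustration only.
PARAM_SIZES = {
    "down_blocks.0.conv.weight": 400,
    "down_blocks.1.conv.weight": 800,
    "mid_block.attn.weight": 600,
    "up_blocks.0.conv.weight": 500,
    "up_blocks.1.conv.weight": 400,
}

# Assumed choice of block; the paper's actual selection is not given here.
trainable = select_trainable(PARAM_SIZES, ["up_blocks.0"])
frozen = set(PARAM_SIZES) - trainable
fraction = trainable_fraction(PARAM_SIZES, trainable)
```

In a real framework the `frozen` set would correspond to parameters excluded from gradient updates. The fraction computed here reflects only the made-up sizes above and should not be read as a reproduction of the paper's reported 30% computation or 20% GPU-time figures.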
