UFID: A Unified Framework for Input-level Backdoor Detection on Diffusion Models

Zihan Guan; Mengxuan Hu; Sheng Li; Anil Vullikanti

arXiv:2404.01101 · cs.CR · February 5, 2025 · 1 citation

TL;DR

This paper introduces UFID, a black-box input-level backdoor detection framework for diffusion models, addressing unique challenges in generative tasks and demonstrating high effectiveness and efficiency through extensive experiments.

Contribution

The paper presents a novel causal analysis-based framework for detecting backdoors in diffusion models, specifically designed for black-box inference scenarios.

Findings

01. High detection accuracy across multiple datasets
02. Effective in both conditional and unconditional diffusion models
03. Fast run-time performance

Abstract

Diffusion models are vulnerable to backdoor attacks, in which attackers inject backdoors by poisoning selected training samples during the training stage. This poses a significant threat to real-world applications in the Model-as-a-Service (MaaS) scenario, where users query diffusion models through APIs or download them directly from the internet. To mitigate the threat of backdoor attacks under MaaS, black-box input-level backdoor detection has drawn recent interest: defenders aim to build a firewall that filters out backdoor samples at the inference stage, with access only to the input queries and the results generated by the diffusion model. Despite some preliminary explorations on traditional classification tasks, these methods cannot be directly applied to generative tasks because of two major challenges: (1) more diverse failures and (2) a multi-modality attack surface.…
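To make the black-box setting concrete, here is a minimal Python sketch of an input-level firewall that sees only queries and generated outputs. It is not UFID's algorithm, which the truncated abstract above does not describe. The detection rule (probe the model with perturbed copies of a query and flag it when the outputs stay nearly identical) is an assumption chosen for illustration, and every name here (`is_suspicious`, `toy_backdoored_model`, `make_perturb`, the `TRIGGER` token, the 0.9 threshold) is hypothetical. Outputs are represented as plain feature vectors standing in for generated images.

```python
import math
import random
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_pairwise_similarity(outputs: List[Vector]) -> float:
    """Average cosine similarity over all unordered pairs (needs >= 2 outputs)."""
    n = len(outputs)
    sims = [cosine(outputs[i], outputs[j]) for i in range(n) for j in range(i + 1, n)]
    return sum(sims) / len(sims)


def is_suspicious(
    query: str,
    generate: Callable[[str], Vector],   # black-box model: query in, output out
    perturb: Callable[[str], str],       # hypothetical query perturbation
    n_probes: int = 8,
    threshold: float = 0.9,              # illustrative value, not from the paper
) -> bool:
    """Flag a query whose outputs barely change under perturbation (assumed rule)."""
    outputs = [generate(perturb(query)) for _ in range(n_probes)]
    return mean_pairwise_similarity(outputs) > threshold


def toy_backdoored_model(query: str) -> Vector:
    """Toy stand-in for a diffusion model; 'TRIGGER' forces a fixed output."""
    if "TRIGGER" in query:
        return [1.0] * 16
    rng = random.Random(query)  # deterministic pseudo-output per distinct query
    return [rng.gauss(0.0, 1.0) for _ in range(16)]


def make_perturb(seed: int = 0) -> Callable[[str], str]:
    """Build a perturbation that appends a random numeric token to the query."""
    rng = random.Random(seed)
    return lambda q: f"{q} {rng.randrange(10**6)}"


if __name__ == "__main__":
    perturb = make_perturb(seed=0)
    for q in ("a photo of a dog", "a photo of a dog TRIGGER"):
        print(q, "->", is_suspicious(q, toy_backdoored_model, perturb))
```

The firewall only ever calls `generate`, so it needs no access to model weights or training data, which is the constraint the abstract places on the defender. A real deployment would replace the toy vectors with embeddings of generated images and would have to calibrate the threshold on known clean queries.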

Code & Models

Repositories

guanzihan/official_ufid
PyTorch · Official

Taxonomy

Topics: Model Reduction and Neural Networks

Methods: Focus · Diffusion