MMA-Diffusion: MultiModal Attack on Diffusion Models

Yijun Yang; Ruiyuan Gao; Xiaosen Wang; Tsung-Yi Ho; Nan Xu; Qiang Xu

arXiv:2311.17516 · cs.CR · April 2, 2024 · 1 citation

TL;DR

MMA-Diffusion demonstrates a novel multi-modal attack method that effectively bypasses safety measures in Text-to-Image models, revealing significant security vulnerabilities in current defenses for both open-source and commercial systems.

Contribution

The paper introduces MMA-Diffusion, a multimodal attack framework that uses both textual and visual inputs to circumvent the prompt filters and post-hoc safety checkers protecting Text-to-Image (T2I) models.

Findings

01. Successfully bypasses prompt filters and safety checkers
02. Reveals vulnerabilities in open-source and commercial T2I models
03. Highlights need for improved safety defenses

Abstract

In recent years, Text-to-Image (T2I) models have seen remarkable advancements, gaining widespread adoption. However, this progress has inadvertently opened avenues for potential misuse, particularly in generating inappropriate or Not-Safe-For-Work (NSFW) content. Our work introduces MMA-Diffusion, a framework that presents a significant and realistic threat to the security of T2I models by effectively circumventing current defensive measures in both open-source models and commercial online services. Unlike previous approaches, MMA-Diffusion leverages both textual and visual modalities to bypass safeguards like prompt filters and post-hoc safety checkers, thus exposing and highlighting the vulnerabilities in existing defense mechanisms.

Taxonomy

Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning · Security and Verification in Computing