On the Multi-modal Vulnerability of Diffusion Models

Dingcheng Yang; Yang Bai; Xiaojun Jia; Yang Liu; Xiaochun Cao; Wenjian Yu

arXiv:2402.01369 · cs.LG · January 6, 2025 · 1 citation

TL;DR

This paper investigates the vulnerabilities of diffusion models in multi-modal settings by visualizing feature spaces and proposing MMP-Attack, a method that manipulates generated images by leveraging multi-modal priors to alter objects in the output.

Contribution

The paper is the first to jointly analyze the text and image feature spaces of diffusion models, and it introduces MMP-Attack, a multi-modal attack method that improves manipulation effectiveness and efficiency.

Findings

01

Prompts are embedded chaotically in the text feature space, whereas in the image feature space they cluster according to their subjects.

02

MMP-Attack effectively manipulates diffusion outputs by appending specific suffixes to prompts.

03

The proposed method outperforms existing attacks in manipulation success and efficiency.
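
The listing does not spell out how a suffix is chosen, so the following is only a toy sketch of the general idea of scoring candidate suffixes against a target in both a text and an image feature space. Everything in it is an assumption made for illustration: the function names `cosine`, `multimodal_score`, and `pick_suffix`, the `weight` parameter, the use of cosine similarity, and the hand-written 2-D vectors standing in for real encoder outputs. It is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multimodal_score(prompt_feat, target_text_feat, target_image_feat, weight=0.5):
    """Hypothetical score: weighted mix of similarity to a target text
    feature and similarity to a target image feature."""
    text_term = cosine(prompt_feat, target_text_feat)
    image_term = cosine(prompt_feat, target_image_feat)
    return weight * text_term + (1.0 - weight) * image_term


def pick_suffix(candidates, target_text_feat, target_image_feat, weight=0.5):
    """Return the candidate suffix whose (prompt + suffix) feature scores highest.

    `candidates` maps each suffix string to the feature vector of the prompt
    with that suffix appended (toy vectors here, not real encoder outputs).
    """
    return max(
        candidates,
        key=lambda s: multimodal_score(
            candidates[s], target_text_feat, target_image_feat, weight
        ),
    )


# Toy data: three made-up suffixes with hand-written 2-D features.
candidates = {
    "suffix_a": [1.0, 0.0],  # matches only the text target
    "suffix_b": [0.0, 1.0],  # matches only the image target
    "suffix_c": [1.0, 1.0],  # partially matches both
}
target_text = [1.0, 0.0]
target_image = [0.0, 1.0]

best = pick_suffix(candidates, target_text, target_image)
```

With equal weighting, a suffix that moves the prompt toward both targets can outscore one that matches a single modality exactly, which is the intuition behind combining the two signals.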

Abstract

Diffusion models have been widely deployed in various image generation tasks, demonstrating an extraordinary connection between image and text modalities. Although prior studies have explored the vulnerability of diffusion models from the perspectives of text and image modalities separately, the current research landscape has not yet thoroughly investigated the vulnerabilities that arise from the integration of multiple modalities, specifically through the joint analysis of textual and visual features. In this paper, we are the first to visualize both text and image feature space embedded by diffusion models and observe a significant difference. The prompts are embedded chaotically in the text feature space, while in the image feature space they are clustered according to their subjects. These fascinating findings may underscore a potential misalignment in robustness between the two…

Code & Models

Repositories

ydc123/mmp-attack (PyTorch, official)

Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Digital Media Forensic Detection

Methods: Diffusion · Focus