On the Robustness of Latent Diffusion Models

Jianping Zhang; Zhuoer Xu; Shiwen Cui; Changhua Meng; Weibin Wu; Michael R. Lyu

arXiv:2306.08257 · cs.CV · June 16, 2023 · 6 cites

TL;DR

This paper thoroughly analyzes the robustness of latent diffusion models against various attacks, evaluates their vulnerabilities in different scenarios, and provides a new benchmark dataset for future research.

Contribution

It offers a comprehensive robustness analysis of latent diffusion models, including white-box and black-box scenarios, and introduces a dataset for benchmarking defenses.

Findings

01. Identifies factors affecting model robustness.
02. Evaluates transfer attack effectiveness.
03. Provides a new dataset for robustness testing.

Abstract

Latent diffusion models achieve state-of-the-art performance on a variety of generative tasks, such as image synthesis and image editing. However, the robustness of latent diffusion models is not well studied. Previous works only focus on the adversarial attacks against the encoder or the output image under white-box settings, regardless of the denoising process. Therefore, in this paper, we aim to analyze the robustness of latent diffusion models more thoroughly. We first study the influence of the components inside latent diffusion models on their white-box robustness. In addition to white-box scenarios, we evaluate the black-box robustness of latent diffusion models via transfer attacks, where we consider both prompt-transfer and model-transfer settings and possible defense mechanisms. However, all these explorations need a comprehensive benchmark dataset, which is missing in the…
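
To make the white-box encoder attack mentioned in the abstract concrete, the following is a minimal sketch of a projected gradient (PGD-style) attack that pushes the latent of a perturbed input away from the latent of the clean input under an L-infinity budget. It is an illustration under stated assumptions, not the paper's attack: the encoder is a toy linear map standing in for a real image encoder, and the names `encode`, `latent_shift`, `pgd_attack`, `eps`, `alpha`, and `steps` are all hypothetical.

```python
import random

def encode(W, x):
    """Toy linear 'encoder' z = W x (assumption: a stand-in, not a real VAE encoder)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def latent_shift(W, x, x_adv):
    """Squared L2 distance between the latents of x and x_adv."""
    z, z_adv = encode(W, x), encode(W, x_adv)
    return sum((a - b) ** 2 for a, b in zip(z_adv, z))

def grad_latent_shift(W, x, x_adv):
    """Analytic gradient of latent_shift w.r.t. x_adv: 2 * W^T (W x_adv - W x)."""
    z, z_adv = encode(W, x), encode(W, x_adv)
    diff = [a - b for a, b in zip(z_adv, z)]
    return [2.0 * sum(W[i][j] * diff[i] for i in range(len(W)))
            for j in range(len(x))]

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def pgd_attack(W, x, eps=0.1, alpha=0.02, steps=20, seed=0):
    """Signed-gradient ascent on latent_shift, projected onto the eps-ball around x."""
    rng = random.Random(seed)
    # Random start inside the eps-ball: the gradient is exactly zero at x_adv == x.
    x_adv = [min(1.0, max(0.0, v + rng.uniform(-eps, eps))) for v in x]
    for _ in range(steps):
        g = grad_latent_shift(W, x, x_adv)
        x_adv = [xa + alpha * sign(gi) for xa, gi in zip(x_adv, g)]
        # Project back onto the L-infinity ball, then onto the valid pixel range [0, 1].
        x_adv = [min(x0 + eps, max(x0 - eps, xa)) for xa, x0 in zip(x_adv, x)]
        x_adv = [min(1.0, max(0.0, xa)) for xa in x_adv]
    return x_adv

# Hypothetical 2x4 encoder weights and a 4-"pixel" input in [0, 1].
W = [[0.5, -1.0, 0.25, 0.0], [1.0, 0.5, -0.5, 0.75]]
x = [0.2, 0.6, 0.4, 0.8]
x_adv = pgd_attack(W, x)
```

In a real latent diffusion model the gradient would come from automatic differentiation through the chosen component (encoder, denoising network, or decoder) rather than from a closed-form expression, but the loop of gradient step followed by projection would look the same.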

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 6 (marginally above the acceptance threshold) · Confidence 3

Strengths

1. The paper addresses the robustness of latent diffusion models, which is highly significant.
2. The paper explores the structure of diffusion models beyond attacking the encoder and discusses black-box attack settings.
3. The proposed attack pipeline is simple and easy to follow.

Weaknesses

1. Lack of comparison with other types of diffusion models. The experiments in the paper primarily focus on different versions of Stable Diffusion, lacking comparisons and discussions regarding other types of diffusion models. This may not comprehensively represent all latent diffusion models. I suggest the authors consider comparing with other models, such as UniDiffuser [1].
2. Limited exploration of diverse attacks and defenses. As the paper aims to explore the robustness of diffusion models fr

Reviewer 02 · Rating 5 (marginally below the acceptance threshold) · Confidence 5

Strengths

1. Robustness Evaluation: The paper excels in providing a thorough investigation of the robustness of LDMs, addressing both white-box and black-box adversarial attacks. This dual perspective enriches the paper's contributions and sets a solid foundation for future research in this domain.
2. Dataset Construction: The introduction of an automatic dataset construction pipeline is a noteworthy contribution, as it streamlines the process of evaluating LDMs and ensures a consistent and reproducible

Weaknesses

1. Methodological Clarity: The paper could benefit from a more explicit elucidation of the adversarial attack strategies employed. Diving deeper into the rationale behind each attack, the expected impacts, and the choice of specific models would significantly enhance the reader's understanding and the paper's overall impact.
2. Limited Scope: Focusing solely on LDMs narrows the breadth of the paper's contributions. Expanding the analysis to include comparisons with other models could provide a

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

- This paper aims to explore the robustness of LDM, which is an important problem in the age of large generative models
- The idea of decomposing the DM into sub-modules is good, by exploring each module, we can get some new insights
- The authors did extensive experiments to support their conclusions
- The paper is well-written and is easy to read

Weaknesses

- For the white box settings (based on gradient):
  (1) Many attacks on LDM have been studied [1, 2], but are not mentioned in this paper.
  (2) Attacking the output of the encoder is not an optimal way, previous work tried to minimize the distance between encoded adv-samples and a target image (e.g. some noise or given image) [2]
  (3) Will the combination of the objective functions of different sub-modules be a stronger attack? This point is not discussed
- For the black box settings: s

Code & Models

Repositories

jpzhang1810/ldm-robustness
jax · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks

Methods: Diffusion · Focus