On the Robustness of Latent Diffusion Models
Jianping Zhang, Zhuoer Xu, Shiwen Cui, Changhua Meng, Weibin Wu, Michael R. Lyu

TL;DR
This paper thoroughly analyzes the robustness of latent diffusion models against various attacks, evaluates their vulnerabilities in different scenarios, and provides a new benchmark dataset for future research.
Contribution
It offers a comprehensive robustness analysis of latent diffusion models, including white-box and black-box scenarios, and introduces a dataset for benchmarking defenses.
Findings
Identifies factors affecting model robustness.
Evaluates transfer attack effectiveness.
Provides a new dataset for robustness testing.
Abstract
Latent diffusion models achieve state-of-the-art performance on a variety of generative tasks, such as image synthesis and image editing. However, the robustness of latent diffusion models is not well studied. Previous works only focus on the adversarial attacks against the encoder or the output image under white-box settings, regardless of the denoising process. Therefore, in this paper, we aim to analyze the robustness of latent diffusion models more thoroughly. We first study the influence of the components inside latent diffusion models on their white-box robustness. In addition to white-box scenarios, we evaluate the black-box robustness of latent diffusion models via transfer attacks, where we consider both prompt-transfer and model-transfer settings and possible defense mechanisms. However, all these explorations need a comprehensive benchmark dataset, which is missing in the…
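As a rough illustration of the kind of white-box encoder attack the abstract refers to, the sketch below runs an L-infinity projected gradient ascent that pushes the latent of a perturbed input away from the latent of the clean input. It is not the paper's pipeline: the linear "encoder" `W`, the function names, and the values of `eps`, `alpha`, and `steps` are all assumptions chosen so the example stays self-contained and runnable without a real diffusion model.

```python
import random

def matvec(W, v):
    """Apply the toy linear encoder: z = W v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def mat_t_vec(W, v):
    """Compute W^T v, used for the gradient of the latent loss."""
    return [sum(W[i][j] * v[i] for i in range(len(W))) for j in range(len(W[0]))]

def sign(a):
    return (a > 0) - (a < 0)

def latent_shift(W, x, x_adv):
    """Squared L2 distance between the adversarial and clean latents."""
    return sum((a - b) ** 2 for a, b in zip(matvec(W, x_adv), matvec(W, x)))

def pgd_latent_attack(W, x, eps=0.05, alpha=0.01, steps=20, seed=0):
    """Maximize ||W(x + delta) - W x||^2 subject to ||delta||_inf <= eps.

    W is a hypothetical stand-in for an image encoder; a real attack would
    backpropagate through the actual network instead of using W^T.
    """
    rng = random.Random(seed)
    z_clean = matvec(W, x)
    # Random start: the gradient is exactly zero at delta = 0.
    delta = [rng.uniform(-eps, eps) for _ in x]
    for _ in range(steps):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        diff = [za - zc for za, zc in zip(matvec(W, x_adv), z_clean)]
        grad = mat_t_vec(W, [2.0 * d for d in diff])
        # Signed ascent step, then projection back onto the eps-ball.
        delta = [max(-eps, min(eps, di + alpha * sign(g)))
                 for di, g in zip(delta, grad)]
    return [xi + di for xi, di in zip(x, delta)]

if __name__ == "__main__":
    W = [[0.5, -0.2, 0.1, 0.3], [0.0, 0.4, -0.3, 0.2]]  # toy 4 -> 2 encoder
    x = [0.2, 0.4, 0.6, 0.8]                            # toy "image"
    x_adv = pgd_latent_attack(W, x)
    print("latent shift:", latent_shift(W, x, x_adv))
```

In a transfer (black-box) setting, the same perturbed input would be crafted on one model or prompt and then evaluated on another; the toy sketch above covers only the white-box crafting step.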
Peer Reviews
Decision · Submitted to ICLR 2024
1. The paper addresses the robustness of latent diffusion models, which is highly significant.
2. The paper explores the structure of diffusion models beyond attacking the encoder and discusses black-box attack settings.
3. The proposed attack pipeline is simple and easy to follow.
1. Lack of comparison with other types of diffusion models. The experiments in the paper primarily focus on different versions of Stable Diffusion, lacking comparisons and discussions regarding other types of diffusion models. This may not comprehensively represent all latent diffusion models. I suggest the authors consider comparing with other models, such as UniDiffuser[1].
2. Limited exploration of diverse attacks and defenses. As the paper aims to explore the robustness of diffusion models fr
1. Robustness Evaluation: The paper excels in providing a thorough investigation of the robustness of LDMs, addressing both white-box and black-box adversarial attacks. This dual perspective enriches the paper's contributions and sets a solid foundation for future research in this domain.
2. Dataset Construction: The introduction of an automatic dataset construction pipeline is a noteworthy contribution, as it streamlines the process of evaluating LDMs and ensures a consistent and reproducible
1. Methodological Clarity: The paper could benefit from a more explicit elucidation of the adversarial attack strategies employed. Diving deeper into the rationale behind each attack, the expected impacts, and the choice of specific models would significantly enhance the reader's understanding and the paper's overall impact.
2. Limited Scope: Focusing solely on LDMs narrows the breadth of the paper's contributions. Expanding the analysis to include comparisons with other models could provide a
- This paper aims to explore the robustness of LDMs, which is an important problem in the age of large generative models.
- The idea of decomposing the DM into sub-modules is good; by exploring each module, we can get some new insights.
- The authors did extensive experiments to support their conclusions.
- The paper is well-written and easy to read.
- For the white-box settings (based on gradient): (1) Many attacks on LDMs have been studied [1, 2], but are not mentioned in this paper. (2) Attacking the output of the encoder is not an optimal way; previous work tried to minimize the distance between encoded adv-samples and a target image (e.g. some noise or a given image) [2]. (3) Will the combination of the objective functions of different sub-modules be a stronger attack? This point is not discussed.
- For the black-box settings: s
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks
Methods: Diffusion · Focus
