Evaluating Robustness in Latent Diffusion Models via Embedding Level Augmentation
Boris Martirosyan, Alexey Karmanov

TL;DR
This paper investigates the robustness of latent diffusion models by introducing embedding-level augmentation techniques, fine-tuning models, and proposing a new evaluation pipeline to better understand their vulnerabilities.
Contribution
It introduces novel data augmentation methods and a dedicated evaluation pipeline to assess and improve the robustness of latent diffusion models.
Findings
Augmentation techniques reveal robustness shortcomings in LDMs.
Fine-tuning with augmentation improves model robustness.
New evaluation pipeline effectively measures robustness.
Abstract
Latent diffusion models (LDMs) achieve state-of-the-art performance across various tasks, including image generation and video synthesis. However, they generally lack robustness, a limitation that has not been fully explored in current research. In this paper, we propose several methods to address this gap. First, we hypothesize that the robustness of LDMs should primarily be measured without their text encoder, because evaluating the whole architecture conflates the failures of the image generator with those of the text encoder. Second, we introduce novel data augmentation techniques designed to reveal robustness shortcomings in LDMs when processing diverse textual prompts. We then fine-tune Stable Diffusion 3 and Stable Diffusion XL models using Dreambooth, incorporating these proposed augmentation methods across multiple tasks. Finally, we propose a novel evaluation pipeline…
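The idea of probing the image generator directly at the embedding level, bypassing the text encoder, can be illustrated with a minimal sketch. The abstract does not specify the paper's augmentation methods or its evaluation metric, so everything here is an assumption made for illustration: Gaussian noise stands in for the augmentation, cosine similarity stands in for the robustness measure, and `augment_embedding`, `robustness_score`, and `toy_generator` are hypothetical names rather than anything from the paper or from a diffusion library.

```python
import math
import random


def augment_embedding(embedding, sigma=0.05, seed=None):
    """Hypothetical augmentation: add isotropic Gaussian noise to a prompt embedding."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, sigma) for x in embedding]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def robustness_score(generator, embedding, n_trials=8, sigma=0.05, seed=0):
    """Mean similarity between the output for the clean embedding and the
    outputs for perturbed embeddings. Assumed metric, not the paper's."""
    clean_output = generator(embedding)
    rng = random.Random(seed)
    scores = []
    for _ in range(n_trials):
        noisy = augment_embedding(embedding, sigma, seed=rng.randrange(2**32))
        scores.append(cosine_similarity(clean_output, generator(noisy)))
    return sum(scores) / len(scores)


def toy_generator(embedding):
    """Stand-in for the image generator: a fixed nonlinear map, not a real LDM."""
    return [math.tanh(2.0 * x) for x in embedding]


if __name__ == "__main__":
    prompt_embedding = [0.1, -0.3, 0.5, 0.7]
    for sigma in (0.0, 0.05, 0.5):
        score = robustness_score(toy_generator, prompt_embedding, sigma=sigma)
        print(f"sigma={sigma}: mean similarity {score:.4f}")
```

Because the perturbation is applied after the text encoder would have run, any drop in the score reflects the generator alone, which is the separation the first hypothesis in the abstract argues for. In a real setting the toy generator would be replaced by the denoising network and the similarity by an image-level metric.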
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Model Reduction and Neural Networks
