Discovering Failure Modes of Text-guided Diffusion Models via Adversarial Search

Qihao Liu, Adam Kortylewski, Yutong Bai, Song Bai, and Alan Yuille

arXiv:2306.00974·cs.CV·December 1, 2023·2 cites

Open Access

TL;DR

This paper introduces SAGE, an adversarial search method that systematically uncovers failure modes in text-guided diffusion models by exploring prompt and latent spaces, revealing issues like semantic inaccuracies and misalignments.

Contribution

We propose SAGE, the first adversarial search technique for text-guided diffusion models (TDMs). It automatically discovers failure cases in both the prompt space and the latent space, and the discovered failures are validated through human inspection.

Findings

01. Identified prompts that produce images with incorrect semantics.

02. Discovered regions in latent space leading to distorted images.

03. Found latent samples causing unrelated, natural-looking images.

Abstract

Text-guided diffusion models (TDMs) are widely applied but can fail unexpectedly. Common failures include: (i) natural-looking text prompts generating images with the wrong content, or (ii) different random samples of the latent variables that generate vastly different, and even unrelated, outputs despite being conditioned on the same text prompt. In this work, we aim to study and understand the failure modes of TDMs in more detail. To achieve this, we propose SAGE, the first adversarial search method on TDMs that systematically explores the discrete prompt space and the high-dimensional latent space, to automatically discover undesirable behaviors and failure cases in image generation. We use image classifiers as surrogate loss functions during searching, and employ human inspections to validate the identified failures. For the first time, our method enables efficient exploration of…
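
The idea of searching the latent space with a classifier as a surrogate loss can be illustrated with a small toy sketch. This is not the SAGE algorithm: the generator and classifier below are hypothetical stand-ins (a fixed linear map and a logistic score), and a simple random-search hill climb is swapped in for whatever optimizer the paper uses. It only shows the shape of the loop: perturb a latent, keep it near its starting point, and accept the candidate when the surrogate classifier becomes less confident that the output matches the prompt.

```python
import math
import random

def toy_generator(z):
    # Hypothetical stand-in for a diffusion model: latent -> two "image features".
    return [z[0] + 0.5 * z[1], z[2] - z[3]]

def toy_classifier(features):
    # Hypothetical surrogate loss: probability that the "image" shows the prompted class.
    logit = 2.0 * features[0] - 1.0 * features[1]
    return 1.0 / (1.0 + math.exp(-logit))

def project(z, center, radius):
    # Keep the candidate inside an L2 ball around the starting latent.
    diff = [a - b for a, b in zip(z, center)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm <= radius:
        return z
    scale = radius / norm
    return [c + d * scale for c, d in zip(center, diff)]

def adversarial_latent_search(z0, steps=500, step_size=0.2, radius=3.0, seed=0):
    # Random-search hill climb (an assumption, chosen for simplicity):
    # accept a perturbed latent only if the classifier score drops.
    rng = random.Random(seed)
    best_z = list(z0)
    best_score = toy_classifier(toy_generator(best_z))
    for _ in range(steps):
        cand = [v + rng.gauss(0.0, step_size) for v in best_z]
        cand = project(cand, z0, radius)
        score = toy_classifier(toy_generator(cand))
        if score < best_score:
            best_z, best_score = cand, score
    return best_z, best_score

z0 = [1.0, 1.0, 0.0, 0.0]
start_score = toy_classifier(toy_generator(z0))
adv_z, adv_score = adversarial_latent_search(z0)
print(f"classifier score: {start_score:.3f} -> {adv_score:.3f}")
```

In a real setting the candidate failures found this way would still need the human inspection the abstract describes, since a low classifier score may reflect a classifier error rather than a genuine generation failure.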

Taxonomy

Topics: Model Reduction and Neural Networks · Topic Modeling · Generative Adversarial Networks and Image Synthesis

Methods: fail · Diffusion · Contrastive Language-Image Pre-training