Espresso: Robust Concept Filtering in Text-to-Image Models

Anudeep Das; Vasisht Duddu; Rui Zhang; N. Asokan

arXiv:2404.19227·cs.CV·February 27, 2025

TL;DR

Espresso is a CLIP-based concept filter for text-to-image models that blocks generated images containing unacceptable concepts while preserving utility on acceptable concepts and remaining robust against adversarial prompts.

Contribution

We introduce Espresso, the first robust concept filter based on CLIP that improves effectiveness, robustness, and utility preservation in concept removal for text-to-image models.

Findings

01 Espresso is more effective than prior concept removal techniques (CRTs).

02 Espresso is more robust against adversarial prompts.

03 Espresso maintains high utility for acceptable concepts.

Abstract

Diffusion-based text-to-image models are trained on large datasets scraped from the Internet, which may contain unacceptable concepts (e.g., copyright-infringing or unsafe). We need concept removal techniques (CRTs) that are i) effective in preventing the generation of images with unacceptable concepts, ii) utility-preserving on acceptable concepts, and iii) robust against evasion with adversarial prompts. No prior CRT satisfies all these requirements simultaneously. We introduce Espresso, the first robust concept filter based on Contrastive Language-Image Pre-Training (CLIP). We identify unacceptable concepts by using the distance from the embedding of a generated image to the text embeddings of both unacceptable and acceptable concepts. This lets us fine-tune for robustness by separating the text embeddings of unacceptable and acceptable concepts while preserving utility. We…
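The filtering idea described in the abstract, comparing a generated image's embedding against the text embeddings of an unacceptable and an acceptable concept, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the softmax over cosine similarities, the temperature of 0.01, the 0.5 threshold, and the three-dimensional toy vectors are all assumptions made for illustration. In practice the embeddings would come from a CLIP image encoder and text encoder.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def unacceptable_probability(
    image_emb: Sequence[float],
    unacceptable_text_emb: Sequence[float],
    acceptable_text_emb: Sequence[float],
    temperature: float = 0.01,  # assumed value, not taken from the source
) -> float:
    """Softmax score of the unacceptable concept versus the acceptable one."""
    s_u = cosine_similarity(image_emb, unacceptable_text_emb) / temperature
    s_a = cosine_similarity(image_emb, acceptable_text_emb) / temperature
    m = max(s_u, s_a)  # subtract the max for numerical stability
    e_u = math.exp(s_u - m)
    e_a = math.exp(s_a - m)
    return e_u / (e_u + e_a)


def is_blocked(
    image_emb: Sequence[float],
    unacceptable_text_emb: Sequence[float],
    acceptable_text_emb: Sequence[float],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> bool:
    """Return True if the image is closer to the unacceptable concept."""
    p = unacceptable_probability(
        image_emb, unacceptable_text_emb, acceptable_text_emb
    )
    return p > threshold


# Toy stand-ins for CLIP embeddings (hypothetical values).
UNACCEPTABLE_TEXT = [1.0, 0.0, 0.0]
ACCEPTABLE_TEXT = [0.0, 1.0, 0.0]

image_near_unacceptable = [0.9, 0.1, 0.0]
image_near_acceptable = [0.1, 0.9, 0.0]

blocked_first = is_blocked(image_near_unacceptable, UNACCEPTABLE_TEXT, ACCEPTABLE_TEXT)
blocked_second = is_blocked(image_near_acceptable, UNACCEPTABLE_TEXT, ACCEPTABLE_TEXT)
```

Under these assumptions, the first toy image is blocked and the second passes. Because the decision depends only on the relative position of the image embedding between the two text embeddings, pushing those text embeddings further apart (the fine-tuning the abstract mentions) would widen the margin between the two outcomes.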

Taxonomy

Topics: Data Management and Algorithms · Image Retrieval and Classification Techniques · Machine Learning and Data Classification

Methods: Contrastive Language-Image Pre-training