UPAM: Unified Prompt Attack in Text-to-Image Generation Models Against Both Textual Filters and Visual Checkers

Duo Peng; Qiuhong Ke; Jun Liu

arXiv:2405.11336·cs.CV·May 28, 2024

TL;DR

UPAM is an attack framework that targets both the textual and visual defenses of text-to-image models. It uses gradient-based optimization and two learning schemes, Sphere-Probing Learning (SPL) and Semantic-Enhancing Learning (SEL), to keep attacks effective, stealthy, and aligned with the intended target image.

Contribution

The paper introduces UPAM, a unified attack method that effectively deceives both textual and visual defenses in T2I models with novel gradient-based techniques.

Findings

01

UPAM outperforms previous attack methods in effectiveness.

02

UPAM maintains a high attack success rate even when the model's defenses block a request and no result is returned.

03

The framework demonstrates high efficiency and stealthiness in attacks.

Abstract

Text-to-Image (T2I) models have raised security concerns due to their potential to generate inappropriate or harmful images. In this paper, we propose UPAM, a novel framework that investigates the robustness of T2I models from the attack perspective. Unlike most existing attack methods that focus on deceiving textual defenses, UPAM aims to deceive both textual and visual defenses in T2I models. UPAM enables gradient-based optimization, offering greater effectiveness and efficiency than previous methods. Given that T2I models might not return results due to defense mechanisms, we introduce a Sphere-Probing Learning (SPL) scheme to support gradient optimization even when no results are returned. Additionally, we devise a Semantic-Enhancing Learning (SEL) scheme to finetune UPAM for generating target-aligned images. Our framework also ensures attack stealthiness. Extensive experiments…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Steganography and Watermarking Techniques · Digital Media Forensic Detection
