Unified Prompt Attack Against Text-to-Image Generation Models
Duo Peng, Qiuhong Ke, Mark He Huang, Ping Hu, Jun Liu

TL;DR
This paper introduces UPAM, a framework for evaluating the robustness of text-to-image models from an attack perspective. It unifies attacks on both textual and visual defenses, replaces enumeration with gradient-based optimization for better efficiency, and keeps the adversarial prompts semantically aligned and natural.
Contribution
UPAM is the first unified attack framework for T2I models that combines gradient-based optimization, defense bypassing, semantic control, naturalness enhancement, and transferability at low query cost.
Findings
UPAM effectively bypasses defenses and induces harmful image generation.
It outperforms prior methods in attack success rate and efficiency.
UPAM produces natural and semantically aligned adversarial prompts.
Abstract
Text-to-Image (T2I) models have advanced significantly, but their growing popularity raises security concerns due to their potential to generate harmful images. To address these issues, we propose UPAM, a novel framework to evaluate the robustness of T2I models from an attack perspective. Unlike prior methods that focus solely on textual defenses, UPAM unifies the attack on both textual and visual defenses. Additionally, it enables gradient-based optimization, overcoming reliance on enumeration for improved efficiency and effectiveness. To handle cases where T2I models block image outputs due to defenses, we introduce Sphere-Probing Learning (SPL) to enable optimization even without image results. Following SPL, our model bypasses defenses, inducing the generation of harmful content. To ensure semantic alignment with attacker intent, we propose Semantic-Enhancing Learning (SEL) for…
Taxonomy
Topics: Advanced Steganography and Watermarking Techniques · Digital Media Forensic Detection · Chaos-based Image/Signal Encryption
Methods: Focus · Semi-Pseudo-Label
