Safeguarding Text-to-Image Generation via Inference-Time Prompt-Noise Optimization
Jiangweizhi Peng, Zhiwei Tang, Gaowen Liu, Charles Fleming, Mingyi Hong

TL;DR
This paper introduces a training-free prompt-noise optimization method that enhances the safety of text-to-image diffusion models by reducing unsafe content generation and resisting adversarial attacks without additional model tuning.
Contribution
The paper proposes a novel, training-free optimization framework that uses prompt embedding and noise trajectory manipulation to improve safety in T2I models, outperforming existing methods.
Findings
Achieves state-of-the-art safety performance in T2I models.
Demonstrates robustness against adversarial attacks.
Maintains comparable generation time to existing methods.
Abstract
Text-to-Image (T2I) diffusion models are widely recognized for their ability to generate high-quality and diverse images based on text prompts. However, despite recent advances, these models are still prone to generating unsafe images containing sensitive or inappropriate content, which can be harmful to users. Current efforts to prevent inappropriate image generation for diffusion models are easy to bypass and vulnerable to adversarial attacks. How to ensure that T2I models align with specific safety goals remains a significant challenge. In this work, we propose a novel, training-free approach, called Prompt-Noise Optimization (PNO), to mitigate unsafe image generation. Our method introduces a novel optimization framework that leverages both the continuous prompt embedding and the injected noise trajectory in the sampling process to generate safe images. Extensive numerical results…
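As a rough illustration of the idea in the abstract (treating both the continuous prompt embedding and the injected noise trajectory as inference-time optimization variables), here is a minimal toy sketch. It is not the authors' implementation. The linear `sample` recursion, the logistic `unsafe_score`, the weight vector `w`, the stopping `threshold`, and all hyperparameters are hypothetical stand-ins chosen so the example runs with the standard library alone. A real system would use a diffusion sampler and a learned safety evaluator.

```python
import math
import random

def sample(c, zs, a=0.9, b=0.1, s=0.3):
    """Toy stand-in for a stochastic sampler (assumption, not a diffusion model).

    Each step applies x <- a*x + b*c + s*z_t, where c is the prompt
    embedding and z_t is the noise injected at step t.
    """
    x = [0.0] * len(c)
    for z in zs:
        x = [a * xi + b * ci + s * zi for xi, ci, zi in zip(x, c, z)]
    return x

def unsafe_score(x, w):
    """Hypothetical safety evaluator: logistic score in (0, 1), higher = less safe."""
    return 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))

def optimize(c0, zs0, w, steps=200, lr=1.0, threshold=0.1, a=0.9, b=0.1, s=0.3):
    """Jointly update the prompt embedding and the noise trajectory by
    gradient descent on the unsafe score. No model weights are trained."""
    c = list(c0)
    zs = [list(z) for z in zs0]
    for _ in range(steps):
        x = sample(c, zs, a, b, s)
        p = unsafe_score(x, w)
        if p < threshold:  # stop early once the output is judged safe enough
            break
        # Gradient of the score with respect to the final sample x_T.
        g = [p * (1.0 - p) * wi for wi in w]
        grad_c = [0.0] * len(c)
        # Manual backward pass through the sampling steps, last step first.
        for t in reversed(range(len(zs))):
            grad_c = [gc + b * gi for gc, gi in zip(grad_c, g)]
            zs[t] = [zi - lr * s * gi for zi, gi in zip(zs[t], g)]
            g = [a * gi for gi in g]  # propagate to the previous state
        c = [ci - lr * gc for ci, gc in zip(c, grad_c)]
    return c, zs

# Usage: start from an "unsafe-leaning" embedding and random noise.
random.seed(0)
w = [1.0, -0.5, 0.8, 0.3]
c0 = [2.0, -1.0, 1.5, 0.5]
zs0 = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(10)]
c_opt, zs_opt = optimize(c0, zs0, w)
print("score before:", unsafe_score(sample(c0, zs0), w))
print("score after: ", unsafe_score(sample(c_opt, zs_opt), w))
```

Because only the inputs to the sampler change, the sketch mirrors the training-free character described above. This toy objective contains only the safety term. A practical objective would likely also need a term that keeps the result faithful to the original prompt, which is omitted here.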
Taxonomy
Topics: Digital Media Forensic Detection · Handwritten Text Recognition Techniques · Advanced Steganography and Watermarking Techniques
Methods: Diffusion · ALIGN
