Safeguarding Text-to-Image Generation via Inference-Time Prompt-Noise Optimization

Jiangweizhi Peng; Zhiwei Tang; Gaowen Liu; Charles Fleming; Mingyi Hong

arXiv:2412.03876·cs.CV·December 6, 2024

TL;DR

This paper introduces a training-free prompt-noise optimization method that enhances the safety of text-to-image diffusion models by reducing unsafe content generation and resisting adversarial attacks without additional model tuning.

Contribution

The paper proposes a training-free optimization framework that adjusts both the continuous prompt embedding and the injected noise trajectory during sampling to improve the safety of text-to-image (T2I) models, and reports that it outperforms existing methods.

Findings

01 Achieves state-of-the-art safety performance in T2I models.

02 Demonstrates robustness against adversarial attacks.

03 Maintains comparable generation time to existing methods.

Abstract

Text-to-Image (T2I) diffusion models are widely recognized for their ability to generate high-quality and diverse images based on text prompts. However, despite recent advances, these models are still prone to generating unsafe images containing sensitive or inappropriate content, which can be harmful to users. Current efforts to prevent inappropriate image generation for diffusion models are easy to bypass and vulnerable to adversarial attacks. How to ensure that T2I models align with specific safety goals remains a significant challenge. In this work, we propose a novel, training-free approach, called Prompt-Noise Optimization (PNO), to mitigate unsafe image generation. Our method introduces a novel optimization framework that leverages both the continuous prompt embedding and the injected noise trajectory in the sampling process to generate safe images. Extensive numerical results…
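The idea in the abstract, optimizing the prompt embedding and the sampling noise together at inference time, can be illustrated with a toy sketch. This is not the authors' implementation: `toy_sampler` is a hypothetical linear recurrence standing in for a diffusion sampler, `toy_unsafe_score` is a hypothetical differentiable safety evaluator, and finite-difference gradients replace backpropagation so the example runs with the standard library only. The step size, threshold, and all numeric inputs are assumptions chosen for illustration.

```python
import math

DIM, STEPS = 2, 5  # toy embedding size and number of sampling steps (assumptions)

def toy_sampler(prompt_emb, noises):
    """Hypothetical stand-in for a diffusion sampler: a linear recurrence."""
    x = list(noises[0])  # initial noise plays the role of the starting latent
    for t in range(1, STEPS + 1):
        x = [0.9 * xi + 0.1 * ci + 0.05 * zi
             for xi, ci, zi in zip(x, prompt_emb, noises[t])]
    return x

def toy_unsafe_score(image, unsafe_dir=(1.0, -0.5)):
    """Hypothetical safety evaluator: sigmoid of a projection, in (0, 1)."""
    s = sum(u * xi for u, xi in zip(unsafe_dir, image))
    return 1.0 / (1.0 + math.exp(-s))

def unpack(params):
    """Split the flat parameter list into prompt embedding and noise trajectory."""
    prompt_emb = params[:DIM]
    noises = [params[DIM * (i + 1): DIM * (i + 2)] for i in range(STEPS + 1)]
    return prompt_emb, noises

def loss(params):
    prompt_emb, noises = unpack(params)
    return toy_unsafe_score(toy_sampler(prompt_emb, noises))

def optimize(params, lr=2.0, iters=200, threshold=0.2, eps=1e-5):
    """Gradient descent on prompt embedding and noise jointly; model is untouched."""
    params = list(params)
    for _ in range(iters):
        current = loss(params)
        if current < threshold:  # stop once the output is judged safe enough
            break
        grad = []
        for i in range(len(params)):
            bumped = list(params)
            bumped[i] += eps
            grad.append((loss(bumped) - current) / eps)  # forward difference
        params = [p - lr * g for p, g in zip(params, grad)]
    return params

# Usage: start from a prompt embedding and noise trajectory that score as unsafe.
prompt_emb = [2.0, -1.0]
noises = [[0.5, -0.5], [0.1, 0.2], [-0.3, 0.4], [0.2, -0.1], [0.0, 0.3], [-0.2, -0.2]]
params0 = prompt_emb + [v for z in noises for v in z]
before = loss(params0)
after = loss(optimize(params0))
print(f"unsafe score before: {before:.3f}, after: {after:.3f}")
```

In a real pipeline the two toy functions would be replaced by an actual sampler and a learned safety classifier, with gradients obtained by automatic differentiation; only the inputs to sampling change, which is what makes the approach training-free.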

Code & Models

Repositories

JonP07/Diffusion-PNO (PyTorch, official)

Taxonomy

Topics: Digital Media Forensic Detection · Handwritten Text Recognition Techniques · Advanced Steganography and Watermarking Techniques

Methods: Diffusion · ALIGN