Dynamic VLM-Guided Negative Prompting for Diffusion Models

Hoyeon Chang; Seungjin Kim; Yoonseok Choi

arXiv:2510.26052·cs.CV·October 31, 2025

TL;DR

This paper introduces a dynamic negative prompting technique for diffusion models: a vision-language model (VLM) inspects intermediate image predictions during denoising and generates negative prompts adapted to them, with the aim of improving image quality and text-image alignment.

Contribution

The paper presents a method that generates negative prompts dynamically by querying a VLM during the denoising process, rather than relying on a single fixed negative prompt, so that negative guidance can adapt to the image as it is being generated.

Findings

01 Improved image quality and text-image alignment in benchmark tests

02 A trade-off between negative guidance strength and text-image alignment

03 Adaptability demonstrated across multiple benchmark datasets

Abstract

We propose a novel approach for dynamic negative prompting in diffusion models that leverages Vision-Language Models (VLMs) to adaptively generate negative prompts during the denoising process. Unlike traditional negative prompting methods that use fixed negative prompts, our method generates intermediate image predictions at specific denoising steps and queries a VLM to produce contextually appropriate negative prompts. We evaluate our approach on various benchmark datasets and demonstrate the trade-offs between negative guidance strength and text-image alignment.
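
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes negative guidance follows the usual classifier-free guidance form, with the negative-prompt noise estimate in place of the unconditional one, and that the intermediate prediction is formed from the prompt-conditioned estimate. Every name here (`sample_with_dynamic_negatives`, `predict_noise`, `predict_x0`, `denoise_step`, `query_vlm`, and the `toy_*` stand-ins) is hypothetical; a real setup would plug in a diffusion model, a scheduler, and a VLM.

```python
from typing import Callable, List, Set, Tuple

Vector = List[float]  # toy stand-in for a latent or image tensor


def guided_noise(eps_pos: Vector, eps_neg: Vector, scale: float) -> Vector:
    # Assumed guidance rule: the negative-prompt estimate replaces the
    # unconditional estimate of classifier-free guidance.
    return [n + scale * (p - n) for p, n in zip(eps_pos, eps_neg)]


def sample_with_dynamic_negatives(
    x: Vector,
    prompt: str,
    num_steps: int,
    query_steps: Set[int],
    predict_noise: Callable[[Vector, int, str], Vector],
    predict_x0: Callable[[Vector, int, Vector], Vector],
    denoise_step: Callable[[Vector, int, Vector], Vector],
    query_vlm: Callable[[Vector, str], str],
    scale: float = 7.5,
) -> Tuple[Vector, List[Tuple[int, str]]]:
    negative = ""  # assumption: start without any negative prompt
    log: List[Tuple[int, str]] = []
    for t in range(num_steps - 1, -1, -1):
        eps_pos = predict_noise(x, t, prompt)
        if t in query_steps:
            # Form an intermediate image prediction and ask the VLM
            # which unwanted content to steer away from.
            preview = predict_x0(x, t, eps_pos)
            negative = query_vlm(preview, prompt)
            log.append((t, negative))
        eps_neg = predict_noise(x, t, negative)
        x = denoise_step(x, t, guided_noise(eps_pos, eps_neg, scale))
    return x, log


# Toy stand-ins so the sketch runs without a model (all hypothetical).
def toy_noise(x: Vector, t: int, text: str) -> Vector:
    return [0.1 * v + 0.01 * len(text) for v in x]


def toy_x0(x: Vector, t: int, eps: Vector) -> Vector:
    return [v - e for v, e in zip(x, eps)]


def toy_step(x: Vector, t: int, eps: Vector) -> Vector:
    return [v - 0.1 * e for v, e in zip(x, eps)]


def toy_vlm(preview: Vector, prompt: str) -> str:
    return "blurry, distorted"


final, log = sample_with_dynamic_negatives(
    [1.0, -0.5, 0.25], "a red bicycle", 8, {6, 3},
    toy_noise, toy_x0, toy_step, toy_vlm,
)
```

In this sketch `scale` plays the role of the negative guidance strength: a larger value pushes the sample further from the VLM-supplied negative prompt, which is the quantity the abstract says is traded off against text-image alignment.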
