Guiding What Not to Generate: Automated Negative Prompting for Text-Image Alignment

Sangha Park; Eunji Kim; Yeongtak Oh; Jooyoung Choi; Sungroh Yoon

arXiv:2512.07702 · cs.CV · December 12, 2025

TL;DR

This paper introduces Negative Prompting for Image Correction (NPC), an automated method that improves text-image alignment in diffusion models by identifying and applying negative prompts that suppress unintended content. Candidate negatives are ranked in text space, so selecting one requires no additional image synthesis.

Contribution

The paper presents an automated negative-prompting pipeline that improves text-image alignment. It analyzes cross-attention patterns to explain why negative prompts help, and it selects effective negatives without extra image generation.

Findings

01

NPC outperforms strong baselines on the GenEval++ and Imagine-Bench benchmarks.

02

It achieves higher text-image alignment scores than those baselines.

03

Automated negative prompting provides a principled approach to improving diffusion model outputs.

Abstract

Despite substantial progress in text-to-image generation, achieving precise text-image alignment remains challenging, particularly for prompts with rich compositional structure or imaginative elements. To address this, we introduce Negative Prompting for Image Correction (NPC), an automated pipeline that improves alignment by identifying and applying negative prompts that suppress unintended content. We begin by analyzing cross-attention patterns to explain why both targeted negatives (those directly tied to the prompt's alignment error) and untargeted negatives (tokens unrelated to the prompt but present in the generated image) can enhance alignment. To discover useful negatives, NPC generates candidate prompts using a verifier-captioner-proposer framework and ranks them with a salient text-space score, enabling effective selection without requiring additional image synthesis. On GenEval++…
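The abstract does not spell out how a negative prompt enters sampling. In common classifier-free guidance implementations, the noise prediction conditioned on the negative prompt takes the place of the unconditional prediction, so each denoising step is pushed toward the positive prompt and away from the negative one. The following is a minimal sketch of that standard combination rule only; it is not NPC's code, the function name `guided_noise` is hypothetical, and the noise predictions are plain lists of floats standing in for model outputs.

```python
from typing import List


def guided_noise(
    eps_positive: List[float],
    eps_negative: List[float],
    guidance_scale: float,
) -> List[float]:
    """Combine two noise predictions with classifier-free guidance.

    eps_positive: noise predicted when conditioning on the user prompt.
    eps_negative: noise predicted when conditioning on the negative prompt
                  (this replaces the usual unconditional prediction).
    guidance_scale: how strongly to move away from the negative prediction
                    and toward the positive one.
    """
    if len(eps_positive) != len(eps_negative):
        raise ValueError("noise predictions must have the same length")
    # Standard rule: eps = eps_neg + s * (eps_pos - eps_neg)
    return [
        n + guidance_scale * (p - n)
        for p, n in zip(eps_positive, eps_negative)
    ]
```

With `guidance_scale` equal to 1 the result is just the positive prediction; larger values extrapolate further away from whatever the negative prompt describes, which is why the choice of negative prompt matters.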

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Digital Humanities and Scholarship