Detail++: Training-Free Detail Enhancer for Text-to-Image Diffusion Models

Lifeng Chen; Jiner Wang; Zihao Pan; Beier Zhu; Xiaofeng Yang; Chi Zhang

arXiv:2507.17853·cs.CV·July 25, 2025

TL;DR

Detail++ is a training-free framework that improves text-to-image generation by decomposing complex prompts into stages, guiding global layout first and then refining details, especially for multiple subjects and styles.

Contribution

It introduces a novel Progressive Detail Injection strategy and a Centroid Alignment Loss, enabling staged, accurate, and style-consistent image synthesis without additional training.

Findings

1. Outperforms existing methods on T2I-CompBench
2. Enhances attribute binding accuracy
3. Improves handling of complex prompts with multiple objects

Abstract

Recent advances in text-to-image (T2I) generation have led to impressive visual results. However, these models still face significant challenges when handling complex prompts, particularly those involving multiple subjects with distinct attributes. Inspired by the human drawing process, which first outlines the composition and then incrementally adds details, we propose Detail++, a training-free framework that introduces a novel Progressive Detail Injection (PDI) strategy to address this limitation. Specifically, we decompose a complex prompt into a sequence of simplified sub-prompts that guide the generation process in stages. This staged generation leverages the inherent layout-controlling capacity of self-attention to first establish the global composition and then refine details precisely. To achieve accurate binding between attributes and corresponding subjects, we exploit cross-attention…
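
The decomposition step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `(noun, [modifiers])` input format, the `decompose` function, and the "one modifier per stage" ordering are all assumptions made for illustration, and the abstract does not say how Detail++ actually parses or orders its sub-prompts. The sketch only shows the general idea of starting from a layout-only prompt and adding attributes incrementally.

```python
from typing import List, Tuple

# Hypothetical input format: (noun, [modifiers]) pairs supplied by the caller.
# The paper's actual prompt parsing is not described in the abstract.
Subject = Tuple[str, List[str]]


def _phrase(noun: str, modifiers: List[str]) -> str:
    """Render one subject, e.g. ("cat", ["white"]) -> "a white cat"."""
    words = " ".join(modifiers + [noun])
    article = "an" if words[0].lower() in "aeiou" else "a"
    return f"{article} {words}"


def decompose(subjects: List[Subject]) -> List[str]:
    """Return sub-prompts ordered from layout-only to fully detailed."""
    # Modifiers revealed so far for each subject; all empty at stage 0.
    active: List[List[str]] = [[] for _ in subjects]

    def render() -> str:
        return " and ".join(
            _phrase(noun, mods) for (noun, _), mods in zip(subjects, active)
        )

    prompts = [render()]  # stage 0: bare subjects, composition only
    for i, (_, modifiers) in enumerate(subjects):
        for modifier in modifiers:
            active[i].append(modifier)  # assumed: exactly one new detail per stage
            prompts.append(render())
    return prompts


stages = decompose([("dog", ["brown"]), ("cat", ["white", "fluffy"])])
for stage in stages:
    print(stage)
```

In a staged pipeline of the kind the abstract describes, the first sub-prompt would fix the global layout and each later sub-prompt would contribute one additional attribute; how those stages interact with self-attention and cross-attention is specific to the paper and is not modelled here.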
