TextGuider: Training-Free Guidance for Text Rendering via Attention Alignment
Kanghyun Baek, Sangyub Lee, Jin Young Choi, Jaewoo Song, Daemin Park, Jooyoung Choi, Chaehun Shin, Bohyung Han, Sungroh Yoon

TL;DR
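TextGuider is a training-free method that improves text rendering in diffusion-based text-to-image models by aligning the attention of text-content tokens with the text regions of the image. It targets text omission, where the requested text is partially or entirely missing, and improves OCR accuracy and CLIP score.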
Contribution
TextGuider introduces an inference-time attention-alignment technique: latent guidance, driven by two new loss functions, is applied during the early denoising steps, so text rendering improves without additional training or fine-tuning.
Findings
Achieves state-of-the-art text rendering performance
Significant improvements in OCR accuracy
Strong results in CLIP score
Abstract
Despite recent advances, diffusion-based text-to-image models still struggle with accurate text rendering. Several studies have proposed fine-tuning or training-free refinement methods for accurate text rendering. However, the critical issue of text omission, where the desired text is partially or entirely missing, remains largely overlooked. In this work, we propose TextGuider, a novel training-free method that encourages accurate and complete text appearance by aligning textual content tokens and text regions in the image. Specifically, we analyze attention patterns in Multi-Modal Diffusion Transformer (MM-DiT) models, particularly for text-related tokens intended to be rendered in the image. Leveraging this observation, we apply latent guidance during the early stage of denoising steps based on two loss functions that we introduce. Our method achieves state-of-the-art performance in…
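The general idea of latent guidance in early denoising steps can be illustrated with a toy sketch. This is not the authors' implementation: the summary here does not specify the two loss functions, so the loss below (negative log of the attention mass that one token places inside a target region) is a hypothetical stand-in, and the "attention" is a plain softmax over dot products rather than an MM-DiT attention layer. All function and parameter names (`attention_over_positions`, `region_loss_and_grad`, `guided_denoise`, `guided_steps`, `step_size`) are assumptions made for illustration.

```python
import math


def attention_over_positions(latent, token):
    """Toy attention: softmax over positions of the dot product latent[i] . token."""
    logits = [sum(x * t for x, t in zip(vec, token)) for vec in latent]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(z - peak) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


def region_loss_and_grad(latent, token, region):
    """Hypothetical loss: -log(attention mass inside `region`), plus its gradient."""
    attn = attention_over_positions(latent, token)
    mass = sum(attn[i] for i in region)
    loss = -math.log(mass)
    grad = []
    for i, p in enumerate(attn):
        # dL/dz_i = p_i - [i in region] * p_i / mass, where z_i = latent[i] . token
        dz = p - (p / mass if i in region else 0.0)
        # Chain rule through z_i gives dL/dlatent[i] = dz * token
        grad.append([dz * t for t in token])
    return loss, grad


def guided_denoise(latent, token, region, num_steps=10, guided_steps=3, step_size=0.5):
    """Apply the guidance update only during the first `guided_steps` steps."""
    losses = []
    for step in range(num_steps):
        loss, grad = region_loss_and_grad(latent, token, region)
        losses.append(loss)
        if step < guided_steps:
            # Gradient step on the latent; the model weights are never touched
            latent = [[x - step_size * g for x, g in zip(vec, gvec)]
                      for vec, gvec in zip(latent, grad)]
        # A real sampler would apply the model's denoising update here; omitted in this toy.
    return latent, losses


if __name__ == "__main__":
    start = [[0.0, 0.0] for _ in range(4)]  # 4 positions, 2 channels
    _, history = guided_denoise(start, token=[1.0, 0.0], region={0, 1})
    print(history)
```

The design point the sketch tries to capture is that guidance only changes the latent, and only early in sampling, which is what makes an approach of this kind training-free.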
Taxonomy
Topics: Computer Graphics and Visualization Techniques · Handwritten Text Recognition Techniques · Generative Adversarial Networks and Image Synthesis
