Type-R: Automatically Retouching Typos for Text-to-Image Generation
Wataru Shimoda, Naoto Inoue, Daichi Haraguchi, Hayato Mitani, Seiichi Uchida, Kota Yamaguchi

TL;DR
Type-R is a post-processing method that improves text rendering accuracy in text-to-image generation. It detects typographical errors in a generated image, erases erroneous text, regenerates text boxes for missing words, and corrects typos in the rendered words, while maintaining image quality.
Contribution
We introduce Type-R, a novel post-processing pipeline that significantly enhances text accuracy in generated images without compromising visual quality.
Findings
Type-R achieves the highest text rendering accuracy among tested methods.
Type-R maintains high image quality while improving text correctness.
Type-R outperforms text-focused generation baselines in balancing text accuracy and image quality.
Abstract
While recent text-to-image models can generate photorealistic images from text prompts that reflect detailed instructions, they still face significant challenges in accurately rendering words in the image. In this paper, we propose to retouch erroneous text renderings in the post-processing pipeline. Our approach, called Type-R, identifies typographical errors in the generated image, erases the erroneous text, regenerates text boxes for missing words, and finally corrects typos in the rendered words. Through extensive experiments, we show that Type-R, in combination with the latest text-to-image models such as Stable Diffusion or Flux, achieves the highest text rendering accuracy while maintaining image quality and also outperforms text-focused generation baselines in terms of balancing text accuracy and image quality.
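The four stages named in the abstract (identify errors, erase, regenerate, correct) all depend on first comparing the words requested in the prompt with the words actually found in the image. The sketch below illustrates only that planning step, under stated assumptions: the detected words are taken to come from some OCR model that is not shown, the matching is a simple greedy pass with Levenshtein distance, and the names `plan_retouch`, `RetouchPlan`, and `max_distance` are hypothetical. It is not the matching procedure used in the paper.

```python
from dataclasses import dataclass, field


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


@dataclass
class RetouchPlan:
    keep: list = field(default_factory=list)        # words rendered correctly
    correct: list = field(default_factory=list)     # (detected, target) pairs with typos
    erase: list = field(default_factory=list)       # detected words with no target
    regenerate: list = field(default_factory=list)  # target words missing from the image


def plan_retouch(target_words, detected_words, max_distance=2):
    """Greedily pair each target word with its closest detected word.

    Assumption: max_distance is an illustrative threshold deciding whether a
    detected word counts as a typo of the target or as unrelated text.
    """
    plan = RetouchPlan()
    remaining = list(detected_words)
    for target in target_words:
        if not remaining:
            plan.regenerate.append(target)
            continue
        best = min(remaining, key=lambda w: edit_distance(w, target))
        dist = edit_distance(best, target)
        if dist > max_distance:
            # Nothing in the image resembles this word: it must be regenerated.
            plan.regenerate.append(target)
            continue
        remaining.remove(best)
        if dist == 0:
            plan.keep.append(target)
        else:
            plan.correct.append((best, target))
    # Detected words that matched no target are surplus text to erase.
    plan.erase.extend(remaining)
    return plan


plan = plan_retouch(["OPEN", "DAILY", "COFFEE"], ["OPEN", "COFFE", "XQZW"])
print(plan)
```

In this example, "OPEN" is kept, "COFFE" is paired with "COFFEE" for typo correction, "XQZW" is marked for erasure, and "DAILY" is marked for regeneration. A greedy pass is the simplest choice for illustration; an optimal assignment over all word pairs would be a natural alternative when several detected words resemble the same target.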
Taxonomy
Topics: Video Analysis and Summarization · Image Retrieval and Classification Techniques · Handwritten Text Recognition Techniques
Methods: Diffusion
