ABHINAW: A method for Automatic Evaluation of Typography within AI-Generated Images
Abhinaw Jagtap, Nachiket Tapas, R. G. Brajesh

TL;DR
ABHINAW introduces a new evaluation matrix for accurately assessing text and typography quality in AI-generated images, addressing limitations of existing benchmarks and enabling better zero-shot text generation.
Contribution
The paper presents a novel scoring matrix and methods for evaluating text accuracy in AI-generated images, especially for diffusion-based models, with detailed error analysis.
Findings
Effective letter-by-letter matching scores for text accuracy
Handling of redundancies and excess text through brevity adjustment
Quantitative analysis of common text errors in AI images
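The summary above does not give the paper's actual formulas, so the following is only a minimal sketch of how letter-level matching and a brevity adjustment could be combined. It assumes an order-insensitive character overlap and a BLEU-style exponential penalty for excess text. The function names (`letter_match_score`, `brevity_adjustment`, `typography_score`) are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def _letters(text: str) -> list[str]:
    # Keep only alphanumeric characters, case-folded, so spacing and
    # punctuation do not affect the comparison (an assumption of this sketch).
    return [c for c in text.lower() if c.isalnum()]


def letter_match_score(target: str, detected: str) -> float:
    """Fraction of target letters that also appear in the detected text.

    Uses a multiset (Counter) intersection, so letter order is ignored.
    """
    t, d = Counter(_letters(target)), Counter(_letters(detected))
    total = sum(t.values())
    if total == 0:
        return 0.0
    return sum((t & d).values()) / total


def brevity_adjustment(target: str, detected: str) -> float:
    """Penalize detected text that is longer than the target.

    Assumed form: 1.0 when there is no excess, else exp(1 - len_d / len_t).
    """
    len_t, len_d = len(_letters(target)), len(_letters(detected))
    if len_t == 0:
        return 0.0
    if len_d <= len_t:
        return 1.0
    return math.exp(1 - len_d / len_t)


def typography_score(target: str, detected: str) -> float:
    # Hypothetical combined score: letter match scaled by the excess penalty.
    return letter_match_score(target, detected) * brevity_adjustment(target, detected)


if __name__ == "__main__":
    print(typography_score("HELLO", "HELLO"))       # exact match
    print(typography_score("HELLO", "HELO"))        # one letter missing
    print(typography_score("HELLO", "HELLOHELLO"))  # redundant repetition
```

In this sketch a missing letter lowers the match term, while repeated or extra text leaves the match term intact but reduces the score through the brevity term. This mirrors the two findings listed above only in spirit.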
Abstract
In the fast-evolving field of Generative AI, platforms like MidJourney, DALL-E, and Stable Diffusion have transformed Text-to-Image (T2I) Generation. However, despite their impressive ability to create high-quality images, they often struggle to generate accurate text within these images. Theoretically, if we could achieve accurate text generation in AI images in a "zero-shot" manner, it would not only make AI-generated images more meaningful but also democratize the graphic design industry. The first step towards this goal is to create a robust scoring matrix for evaluating text accuracy in AI-generated images. Although there are existing benchmarking methods like CLIP SCORE and T2I-CompBench++, there's still a gap in systematically evaluating text and typography in AI-generated images, especially with diffusion-based methods. In this paper, we introduce a novel evaluation matrix…
Taxonomy
Topics: 3D Surveying and Cultural Heritage · Generative Adversarial Networks and Image Synthesis · Human Motion and Animation
Methods: Diffusion · Contrastive Language-Image Pre-training
