ABHINAW: A method for Automatic Evaluation of Typography within AI-Generated Images

Abhinaw Jagtap; Nachiket Tapas; R. G. Brajesh

arXiv:2409.11874 · cs.CV · September 19, 2024

Open Access

TL;DR

ABHINAW introduces an evaluation matrix for assessing the accuracy of text and typography in AI-generated images. It targets a gap left by existing benchmarks and is framed as a first step toward accurate zero-shot text generation.

Contribution

The paper presents a novel scoring matrix and methods for evaluating text accuracy in AI-generated images, especially for diffusion-based models, with detailed error analysis.

Findings

01. Effective letter-by-letter matching scores for text accuracy
02. Handling of redundancies and excess text through brevity adjustment
03. Quantitative analysis of common text errors in AI images
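
To make the first two findings concrete, here is a minimal Python sketch of a letter-level match score with a brevity-style adjustment. It is an illustration under stated assumptions, not the paper's formula: this page does not give the ABHINAW definition, so the function name `letter_match_score`, the bag-of-letters matching, and the length-ratio penalty are all hypothetical choices.

```python
from collections import Counter


def letter_match_score(reference: str, generated: str) -> float:
    """Hypothetical letter-level score in [0, 1]; NOT the paper's definition.

    Assumptions made for this sketch:
      - comparison is case-insensitive and ignores non-alphanumeric characters
      - letters are matched as a multiset, so character order is ignored
      - excess generated text is penalised by the ratio of reference length
        to generated length (a brevity-style adjustment)
    """
    ref = Counter(c for c in reference.lower() if c.isalnum())
    gen = Counter(c for c in generated.lower() if c.isalnum())
    ref_total = sum(ref.values())
    gen_total = sum(gen.values())
    if ref_total == 0 or gen_total == 0:
        return 0.0

    # Letter-by-letter overlap: each reference letter counts at most as
    # often as it actually appears in the generated text.
    matched = sum(min(count, gen[ch]) for ch, count in ref.items())
    coverage = matched / ref_total

    # Adjustment for redundant or excess text: no penalty when the generated
    # text is no longer than the reference, proportional penalty otherwise.
    length_penalty = min(1.0, ref_total / gen_total)
    return coverage * length_penalty


if __name__ == "__main__":
    print(letter_match_score("HELLO", "hello"))
    print(letter_match_score("HELLO", "HELLO HELLO"))
    print(letter_match_score("HELLO", "HELO"))
```

In this sketch an exact match scores highest, a duplicated word is penalised for the extra letters, and a dropped letter lowers the coverage term. Because it ignores character order, an anagram would also score as a perfect match, which a real typography metric may need to handle differently.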

Abstract

In the fast-evolving field of Generative AI, platforms like MidJourney, DALL-E, and Stable Diffusion have transformed Text-to-Image (T2I) Generation. However, despite their impressive ability to create high-quality images, they often struggle to generate accurate text within these images. Theoretically, if we could achieve accurate text generation in AI images in a "zero-shot" manner, it would not only make AI-generated images more meaningful but also democratize the graphic design industry. The first step towards this goal is to create a robust scoring matrix for evaluating text accuracy in AI-generated images. Although there are existing benchmarking methods like CLIPScore and T2I-CompBench++, there is still a gap in systematically evaluating text and typography in AI-generated images, especially with diffusion-based methods. In this paper, we introduce a novel evaluation matrix…

Taxonomy

Topics: 3D Surveying and Cultural Heritage · Generative Adversarial Networks and Image Synthesis · Human Motion and Animation

Methods: Diffusion · Contrastive Language-Image Pre-training