TypeScore: A Text Fidelity Metric for Text-to-Image Generative Models
Georgia Gabriela Sampaio, Ruixiang Zhang, Shuangfei Zhai, Jiatao Gu, Josh Susskind, Navdeep Jaitly, Yizhe Zhang

TL;DR
TypeScore is a new evaluation metric that accurately measures a text-to-image model's ability to generate images with high-fidelity embedded text, providing finer discrimination than existing metrics like CLIPScore.
Contribution
This work introduces TypeScore, a novel metric that assesses the fidelity of embedded text in generated images, enhancing evaluation sensitivity for instruction-following capabilities.
Findings
TypeScore outperforms CLIPScore in differentiating models based on embedded text fidelity.
The metric effectively evaluates stylistic adherence in image generation.
Human studies validate the effectiveness of TypeScore as an evaluation tool.
Abstract
Evaluating text-to-image generative models remains a challenge, despite the remarkable progress being made in their overall performances. While existing metrics like CLIPScore work for coarse evaluations, they lack the sensitivity to distinguish finer differences as model performance rapidly improves. In this work, we focus on the text rendering aspect of these models, which provides a lens for evaluating a generative model's fine-grained instruction-following capabilities. To this end, we introduce a new evaluation framework called TypeScore to sensitively assess a model's ability to generate images with high-fidelity embedded text by following precise instructions. We argue that this text generation capability serves as a proxy for general instruction-following ability in image synthesis. TypeScore uses an additional image description model and leverages an ensemble dissimilarity…
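To make the idea of scoring embedded-text fidelity concrete, the following is a minimal sketch that compares the text a prompt asked for against the text read back from a generated image. It is an illustration under stated assumptions, not the paper's implementation: the abstract above is truncated before it describes the ensemble, so the two measures used here (normalized edit distance and word-level Jaccard distance) and their unweighted average are stand-ins chosen for simplicity. The function names are hypothetical, and the extracted text is assumed to come from some image description model that is not shown.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalized_edit_distance(target: str, extracted: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string's length."""
    if not target and not extracted:
        return 0.0
    return levenshtein(target, extracted) / max(len(target), len(extracted))


def word_jaccard_distance(target: str, extracted: str) -> float:
    """1 minus the Jaccard overlap of lowercased word sets."""
    t = set(target.lower().split())
    e = set(extracted.lower().split())
    if not t and not e:
        return 0.0
    return 1.0 - len(t & e) / len(t | e)


def text_dissimilarity(target: str, extracted: str) -> float:
    """Assumed ensemble: unweighted mean of the two measures above.

    0.0 means the extracted text matches the instruction exactly;
    values closer to 1.0 mean poorer fidelity.
    """
    scores = [
        normalized_edit_distance(target, extracted),
        word_jaccard_distance(target, extracted),
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    instructed = "Hello World"   # text the prompt asked the model to render
    read_back = "Hello Wor1d"    # hypothetical output of a description model
    print(text_dissimilarity(instructed, read_back))
```

A single wrong character leaves the image semantically close to the prompt but still yields a clearly non-zero score here. That kind of fine-grained sensitivity is what the abstract says coarse metrics such as CLIPScore lack.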
Taxonomy
Topics: Image Retrieval and Classification Techniques · Video Analysis and Summarization
