TeTIm-Eval: a novel curated evaluation data set for comparing text-to-image models
Federico A. Galatolo, Mario G. C. A. Cimino, Edoardo Cogotti

TL;DR
This paper introduces TeTIm-Eval, a curated dataset and evaluation framework for comparing text-to-image models, combining high-quality data, CLIP-score, and human judgment to improve assessment accuracy.
Contribution
It presents a new curated dataset and evaluation methodology that integrates quantitative and human assessments for better comparison of text-to-image models.
Findings
Human judgment accuracy aligns with CLIP-score results
The dataset covers ten diverse categories
Evaluation method applied successfully to recent models
Abstract
Evaluating and comparing text-to-image models is a challenging problem. Significant advances in the field have recently been made, piquing the interest of various industrial sectors. As a consequence, a gold standard in the field should cover a variety of tasks and application contexts. This paper experiments with a novel evaluation approach based on: (i) a curated data set, made of high-quality royalty-free image-text pairs, divided into ten categories; (ii) a quantitative metric, the CLIP-score; (iii) a human evaluation task to distinguish, for a given text, the real and the generated images. The proposed method has been applied to the most recent models, i.e., DALLE2, Latent Diffusion, Stable Diffusion, GLIDE and Craiyon. Early experimental results show that the accuracy of the human judgement is fully consistent with the CLIP-score. The dataset has been made available to the…
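The two measurements named in the abstract can be illustrated with a minimal sketch. It assumes the CLIP-score is the plain cosine similarity between a text embedding and an image embedding (some formulations rescale or clip this value, and the exact variant used in the paper is not stated here), and that human accuracy is the fraction of trials in which a rater correctly identifies the real image. The function names and the toy vectors are hypothetical; real embeddings would come from a CLIP model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def clip_score(text_emb: Sequence[float], image_emb: Sequence[float]) -> float:
    # Assumption: unscaled cosine similarity between the two embeddings.
    return cosine_similarity(text_emb, image_emb)


def human_accuracy(correct: Sequence[bool]) -> float:
    """Fraction of trials where the rater correctly picked the real image."""
    if not correct:
        raise ValueError("at least one trial is required")
    return sum(correct) / len(correct)


# Toy, hand-made vectors standing in for CLIP embeddings (not real data).
caption = [1.0, 0.0]
matching_image = [1.0, 0.0]
unrelated_image = [0.0, 1.0]

score_match = clip_score(caption, matching_image)
score_unrelated = clip_score(caption, unrelated_image)
accuracy = human_accuracy([True, False, True, True])
```

If the human task is a two-way choice, an accuracy close to 0.5 would mean raters cannot tell generated images from real ones, while a higher CLIP-score indicates closer text-image agreement.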
Taxonomy
Topics: Image Retrieval and Classification Techniques · Handwritten Text Recognition Techniques · Computer Graphics and Visualization Techniques
Methods: Diffusion · Guided Language to Image Diffusion for Generation and Editing
