EvalGIM: A Library for Evaluating Generative Image Models

Melissa Hall; Oscar Mañas; Reyhane Askari-Hemmat; Mark Ibrahim; Candace Ross; Pietro Astolfi; Tariq Berrada Ifriqi; Marton Havasi; Yohann Benchetrit; Karen Ullrich; Carolina Braga; Abhishek Charnalia; Maeve Ryan; Mike Rabbat; Michal Drozdzal; Jakob Verbeek; Adriana Romero-Soriano

arXiv:2412.10604·cs.CV·December 19, 2024

TL;DR

EvalGIM is a flexible, unified library for evaluating text-to-image generative models, supporting multiple datasets and metrics, and providing actionable insights through novel analysis methods.

Contribution

It introduces a comprehensive, customizable benchmarking framework with new evaluation techniques for assessing generative image models.

Findings

01. Provides broad support for datasets and metrics that measure quality, diversity, and consistency

02. Includes state-of-the-art evaluation methods such as Pareto Fronts and performance disparity measurements

03. Provides new analysis tools for robustness and prompt style balance
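
To make the Pareto Front idea concrete, here is a minimal sketch in plain Python. It does not use the EvalGIM API; the model names, the scores, and the helper functions `dominates` and `pareto_front` are hypothetical and exist only for illustration. It also assumes that higher is better for every metric, so a lower-is-better metric such as FID would need to be negated first.

```python
from typing import Dict, List


def dominates(a: Dict[str, float], b: Dict[str, float], metrics: List[str]) -> bool:
    """Return True if `a` is at least as good as `b` on every metric
    and strictly better on at least one (higher is better)."""
    at_least_as_good = all(a[m] >= b[m] for m in metrics)
    strictly_better = any(a[m] > b[m] for m in metrics)
    return at_least_as_good and strictly_better


def pareto_front(models: Dict[str, Dict[str, float]], metrics: List[str]) -> List[str]:
    """Return the sorted names of models that no other model dominates."""
    return sorted(
        name
        for name, scores in models.items()
        if not any(
            dominates(other, scores, metrics)
            for other_name, other in models.items()
            if other_name != name
        )
    )


# Hypothetical scores, invented for this example only.
scores = {
    "model_a": {"quality": 0.80, "diversity": 0.40, "consistency": 0.70},
    "model_b": {"quality": 0.60, "diversity": 0.65, "consistency": 0.72},
    "model_c": {"quality": 0.55, "diversity": 0.35, "consistency": 0.60},
}
metric_names = ["quality", "diversity", "consistency"]

front = pareto_front(scores, metric_names)
print(front)  # ['model_a', 'model_b']
```

In this toy data, model_c is dropped because model_a matches or beats it on all three axes, while model_a and model_b both stay on the front because each wins on a different axis. That trade-off view is what a Pareto Front surfaces and a single aggregate score hides.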

Abstract

As the use of text-to-image generative models increases, so does the adoption of automatic benchmarking methods used in their evaluation. However, while metrics and datasets abound, there are few unified benchmarking libraries that provide a framework for performing evaluations across many datasets and metrics. Furthermore, the rapid introduction of increasingly robust benchmarking methods requires that evaluation libraries remain flexible to new datasets and metrics. Finally, there remains a gap in synthesizing evaluations in order to deliver actionable takeaways about model performance. To enable unified, flexible, and actionable evaluations, we introduce EvalGIM (pronounced "EvalGym"), a library for evaluating generative image models. EvalGIM contains broad support for datasets and metrics used to measure quality, diversity, and consistency of text-to-image generative models. In…

Code & Models

Repositories

facebookresearch/evalgim (PyTorch, official)

Taxonomy

Topics: Image Retrieval and Classification Techniques · Generative Adversarial Networks and Image Synthesis · Medical Image Segmentation Techniques

Methods: Lib