A Study on the Evaluation of Generative Models
Eyal Betzalel, Coby Penso, Aviv Navon, Ethan Fetaya

TL;DR
This paper examines how well current evaluation metrics for implicit generative models work, showing that they are unreliable for fine-grained model comparison and exploring which metrics correlate better with probabilistic measures.
Contribution
The study systematically analyzes evaluation metrics such as FID and IS, highlights their inconsistencies, and offers insight into how they correlate with probabilistic measures.
Findings
FID and IS correlate with certain f-divergences
Ranking of similar models varies across metrics
The base features on which an evaluation metric is computed affect its effectiveness
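One common way to quantify how much two metrics agree on a ranking of models is a rank correlation such as Kendall's tau. The sketch below is illustrative only and is not taken from the paper; the function name `kendall_tau` is an assumption, and it implements the simple tau-a variant that does not correct for ties.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score lists over the same models.

    Returns 1.0 when both lists rank the models identically and
    -1.0 when the rankings are exactly reversed. Tied pairs count
    as neither concordant nor discordant.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long lists with at least 2 items")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(scores_a)), 2):
        # The sign of the product tells us whether both metrics
        # order models i and j the same way.
        sign = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1

    n_pairs = len(scores_a) * (len(scores_a) - 1) // 2
    return (concordant - discordant) / n_pairs
```

When comparing FID with IS this way, one of the two score lists should be negated first, since lower FID is better while higher IS is better.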
Abstract
Implicit generative models, which do not return likelihood values, such as generative adversarial networks and diffusion models, have become prevalent in recent years. While these models have shown remarkable results, evaluating their performance is challenging. This issue is of vital importance to push research forward and to distinguish meaningful gains from random noise. Currently, heuristic metrics such as the Inception Score (IS) and Fréchet Inception Distance (FID) are the most common evaluation metrics, but what they measure is not entirely clear. Additionally, there are questions regarding how meaningful their scores actually are. In this work, we study the evaluation metrics of generative models by generating a high-quality synthetic dataset on which we can estimate classical metrics for comparison. Our study shows that while FID and IS do correlate to several…
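For reference, the two heuristic metrics named in the abstract have standard definitions: FID is the Fréchet distance between Gaussians fitted to feature vectors of real and generated samples, and IS is the exponential of the average KL divergence between each sample's predicted class distribution and the marginal class distribution. The sketch below is not the paper's code. It assumes the features and class probabilities have already been extracted (by an Inception network in the usual setup), and the function names are illustrative.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features).
    Computes ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^(1/2)).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((C_a C_b)^(1/2)) equals the sum of square roots of the
    # eigenvalues of C_a C_b, which are real and non-negative for
    # positive semi-definite inputs (clipped here for numerical noise).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b)
                 - 2.0 * trace_sqrt)


def inception_score(probs, eps=1e-12):
    """exp of the mean KL(p(y|x) || p(y)) over samples.

    probs: array of shape (n_samples, n_classes); each row is a
    predicted class distribution for one generated sample.
    """
    marginal = probs.mean(axis=0, keepdims=True)
    kl = (probs * (np.log(probs + eps) - np.log(marginal + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))
```

Both functions depend entirely on the upstream feature extractor or classifier, which is why the choice of base features matters for what these scores end up measuring.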
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Data Visualization and Analytics · Generative Adversarial Networks and Image Synthesis
Methods: Diffusion · Balanced Selection
