Reliable Fidelity and Diversity Metrics for Generative Models
Muhammad Ferjad Naeem, Seong Joon Oh, Youngjung Uh, Yunjey Choi, Jaejun Yoo

TL;DR
This paper introduces new density and coverage metrics for evaluating generative models, addressing limitations of existing precision and recall metrics by providing more reliable and interpretable measures of fidelity and diversity.
Contribution
The paper proposes density and coverage metrics that improve reliability and interpretability over existing precision and recall metrics for generative model evaluation.
Findings
Density and coverage correctly detect when two distributions are identical.
They are robust to outliers and to the choice of evaluation hyperparameters.
They are more reliable and more interpretable than existing precision and recall metrics.
Abstract
Devising indicative evaluation metrics for the image generation task remains an open problem. The most widely used metric for measuring the similarity between real and generated images has been the Fréchet Inception Distance (FID) score. Because it does not differentiate the fidelity and diversity aspects of the generated images, recent papers have introduced variants of precision and recall metrics to diagnose those properties separately. In this paper, we show that even the latest version of the precision and recall metrics is not reliable yet. For example, they fail to detect the match between two identical distributions, they are not robust against outliers, and the evaluation hyperparameters are selected arbitrarily. We propose density and coverage metrics that solve the above issues. We analytically and experimentally show that density and coverage provide more interpretable…
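The abstract does not spell out the formulas, so the following is only a minimal sketch of how density and coverage are commonly computed from k-nearest-neighbour balls around the real samples. It assumes feature embeddings are already available as NumPy arrays. The function names, the default k=5, and the brute-force distance computation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def pairwise_distances(a, b):
    """Euclidean distances between rows of a (N, D) and rows of b (M, D)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def kth_nn_radii(real, k):
    """Distance from each real sample to its k-th nearest real neighbour."""
    d = pairwise_distances(real, real)
    # After sorting, column 0 is the self-distance (0), so column k
    # holds the distance to the k-th nearest neighbour.
    return np.sort(d, axis=1)[:, k]


def density_coverage(real, fake, k=5):
    """Sketch of density (fidelity) and coverage (diversity).

    Assumed definitions (not stated in the summary above):
      density  = number of (real ball, fake sample) containments / (k * M)
      coverage = fraction of real balls containing at least one fake sample
    """
    radii = kth_nn_radii(real, k)             # shape (N,)
    d_rf = pairwise_distances(real, fake)     # shape (N, M)
    inside = d_rf < radii[:, None]            # fake j lies in ball of real i
    density = inside.sum() / (k * fake.shape[0])
    coverage = inside.any(axis=1).mean()
    return float(density), float(coverage)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(200, 8))
    print(density_coverage(real, real.copy()))      # identical sample sets
    print(density_coverage(real, real + 1000.0))    # disjoint sample sets
```

Under these assumed definitions, two identical sample sets with no tied distances give a density and a coverage of 1.0, and generated samples far from every real sample give 0.0 for both, which is the "identical distributions" sanity check the abstract refers to. Density is not capped at 1: it can exceed 1 when generated samples concentrate in regions densely populated by real samples.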
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques · Visual Attention and Saliency Detection
