How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models
Ahmed M. Alaa, Boris van Breugel, Evgeny Saveliev, Mihaela van der Schaar

TL;DR
This paper introduces a domain-agnostic, three-dimensional evaluation metric for generative models that assesses fidelity, diversity, and privacy-related generalization, enabling detailed diagnosis and auditing of individual generated samples.
Contribution
The authors propose a novel, sample-level, three-component metric combining statistical divergence and precision-recall analysis for comprehensive generative model evaluation.
Findings
The metric effectively diagnoses mode collapse and failure modes.
It quantifies model generalization and privacy leakage.
Sample-level auditing improves overall model quality.
Abstract
Devising domain- and model-agnostic evaluation metrics for generative models is an important and as yet unresolved problem. Most existing metrics, which were tailored solely to the image synthesis setup, exhibit a limited capacity for diagnosing the different modes of failure of generative models across broader application domains. In this paper, we introduce a 3-dimensional evaluation metric, (α-Precision, β-Recall, Authenticity), that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion. Our metric unifies statistical divergence measures with precision-recall analysis, enabling sample- and distribution-level diagnoses of model fidelity and diversity. We introduce generalization as an additional, independent dimension (to the fidelity-diversity trade-off) that quantifies the extent to which a model copies…
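To make the sample-level auditing idea concrete, here is a minimal sketch of a nearest-neighbour copy check in the spirit of the Authenticity dimension. It is an illustrative heuristic under stated assumptions, not the estimator defined in the paper: the function names `pairwise_distances` and `authenticity_flags`, the Euclidean distance, and the toy Gaussian data are all hypothetical choices made for this example. A synthetic sample is flagged as a likely copy when it lies closer to its nearest training point than that training point lies to any other training point.

```python
import numpy as np

def pairwise_distances(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def authenticity_flags(real, synth):
    """Return a boolean array: True = sample looks authentic (not a near-copy).

    Assumed heuristic: a synthetic sample counts as a near-copy when its
    distance to the nearest real sample is smaller than that real sample's
    distance to its own nearest real neighbour.
    """
    # Distance from each real sample to its nearest *other* real sample.
    d_rr = pairwise_distances(real, real)
    np.fill_diagonal(d_rr, np.inf)
    nn_real_dist = d_rr.min(axis=1)

    # Nearest real sample for each synthetic sample.
    d_sr = pairwise_distances(synth, real)
    nearest_real = d_sr.argmin(axis=1)
    d_to_nearest = d_sr[np.arange(len(synth)), nearest_real]

    return d_to_nearest >= nn_real_dist[nearest_real]

# Toy data (hypothetical): "copied" samples are tiny perturbations of real ones.
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 2))
copied = real[:20] + 1e-6 * rng.normal(size=(20, 2))
fresh = rng.normal(size=(20, 2))

flags_copied = authenticity_flags(real, copied)
flags_fresh = authenticity_flags(real, fresh)
print("authentic fraction (copied):", flags_copied.mean())
print("authentic fraction (fresh): ", flags_fresh.mean())
```

Because the flags are computed per sample, the same array can be used to discard individual flagged samples or averaged into a single model-level score.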
Taxonomy
Topics: Scientific Computing and Data Management · Topic Modeling · Generative Adversarial Networks and Image Synthesis
