MAUVE Scores for Generative Models: Theory and Practice
Krishna Pillutla, Lang Liu, John Thickstun, Sean Welleck, Swabha Swayamdipta, Rowan Zellers, Sewoong Oh, Yejin Choi, Zaid Harchaoui

TL;DR
MAUVE is a new set of statistical scores designed to compare generative models' output distributions with real data, effectively measuring quality in text and images and correlating well with human judgments.
Contribution
This paper introduces MAUVE scores, a novel family of divergence-based metrics for evaluating generative models across text and image domains, with theoretical bounds and practical estimation methods.
Findings
MAUVE scores correlate with human judgments of text quality.
MAUVE effectively identifies properties of generated images.
The proposed methods outperform existing metrics in certain scenarios.
Abstract
Generative artificial intelligence has made significant strides, producing text indistinguishable from human prose and remarkably photorealistic images. Automatically measuring how close the generated data distribution is to the target distribution is central to diagnosing existing models and developing better ones. We present MAUVE, a family of comparison measures between pairs of distributions such as those encountered in the generative modeling of text or images. These scores are statistical summaries of divergence frontiers capturing two types of errors in generative modeling. We explore three approaches to statistically estimate these scores: vector quantization, non-parametric estimation, and classifier-based estimation. We provide statistical bounds for the vector quantization approach. Empirically, we find that the proposed scores paired with a range of f-divergences and…
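To make "statistical summaries of divergence frontiers" concrete, here is a minimal sketch of how such a score could be computed once both samples have been vector-quantized into histograms over the same bins. The abstract does not spell out the formula, so the sketch assumes the KL-based construction commonly associated with MAUVE: for mixtures R_λ = λP + (1 − λ)Q, plot the points (exp(−c·KL(Q‖R_λ)), exp(−c·KL(P‖R_λ))) and take the area under the resulting curve. The function names, the scaling constant c = 5, and the grid of 99 mixture weights are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; bins with p == 0 contribute 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def divergence_frontier_score(p, q, scale=5.0, num_points=99):
    """Area under a KL divergence frontier between two histograms.

    p, q: non-negative bin counts or probabilities over the same bins
          (e.g. cluster assignments of real vs. generated samples).
    scale: hypothetical scaling constant c applied inside exp(-c * KL).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()

    # Mixture weights strictly inside (0, 1), visited from high to low so
    # that the x-coordinate of the curve is non-decreasing.
    lambdas = np.linspace(0.0, 1.0, num_points + 2)[1:-1][::-1]

    xs, ys = [0.0], [1.0]  # limiting point as lambda -> 1
    for lam in lambdas:
        r = lam * p + (1.0 - lam) * q
        xs.append(np.exp(-scale * kl_divergence(q, r)))  # one error type
        ys.append(np.exp(-scale * kl_divergence(p, r)))  # the other error type
    xs.append(1.0)  # limiting point as lambda -> 0
    ys.append(0.0)

    xs = np.array(xs)
    ys = np.array(ys)
    # Trapezoidal area under the piecewise-linear curve.
    return float(np.sum(0.5 * (ys[:-1] + ys[1:]) * np.diff(xs)))


if __name__ == "__main__":
    real = [0.5, 0.5]
    print(divergence_frontier_score(real, [0.5, 0.5]))  # identical histograms
    print(divergence_frontier_score(real, [0.6, 0.4]))  # small mismatch
    print(divergence_frontier_score(real, [0.9, 0.1]))  # large mismatch
```

Under these assumptions, identical histograms give a score of 1, and the score falls toward 0 as the two distributions move apart. The two coordinates of each curve point correspond to the two types of error mentioned in the abstract.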
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computational and Text Analysis Methods · Multimodal Machine Learning Applications
