Grade Inflation in Generative Models

Phuc Nguyen, Miao Li, Alexandra Morgan, Rima Arnaout, and Ramy Arnaout

arXiv:2501.00664 · cs.AI · January 24, 2025

TL;DR

This paper identifies grade inflation in commonly used quality scores for generative models: the scores rate synthetic data as a better match to ground truth than it should be. It introduces the Eden score as a remedy and reports that Eden avoids grade inflation and agrees better with human perception of goodness-of-fit.

Contribution

The paper introduces the Eden score, to the authors' knowledge the first equidensity score, and reports that it avoids grade inflation and agrees better with human perception than the equipoint scores it is compared against.

Findings

01

Many commonly used quality scores (correlation, Jaccard, earth-mover's, and Kullback-Leibler) suffer from grade inflation.

02

Eden score avoids grade inflation and aligns with human perception.

03

Equidensity scores relate to Rényi entropy.

Abstract

Generative models hold great potential, but only if one can trust the evaluation of the data they generate. We show that many commonly used quality scores for comparing two-dimensional distributions of synthetic vs. ground-truth data give better results than they should, a phenomenon we call the "grade inflation problem." We show that the correlation score, Jaccard score, earth-mover's score, and Kullback-Leibler (relative-entropy) score all suffer grade inflation. We propose that any score that values all datapoints equally, as these do, will also exhibit grade inflation; we refer to such scores as "equipoint" scores. We introduce the concept of "equidensity" scores, and present the Eden score, to our knowledge the first example of such a score. We found that Eden avoids grade inflation and agrees better with human perception of goodness-of-fit than the equipoint scores above. We…
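To make the kind of comparison discussed in the abstract concrete, here is a minimal sketch that bins two two-dimensional point clouds and computes textbook versions of three of the quantities named above (Pearson correlation, Jaccard index, and Kullback-Leibler divergence). Everything in it is an assumption made for illustration: the function names, the 20x20 binning, the Gaussian toy data, and the exact formulas, which may differ from how the paper turns these quantities into scores. The Eden score is not defined in this excerpt, so it is not implemented here.

```python
import numpy as np


def binned_density(points, bins, extent):
    """Bin 2-D points into a histogram normalized so the cells sum to 1."""
    hist, _, _ = np.histogram2d(points[:, 0], points[:, 1], bins=bins, range=extent)
    return hist / hist.sum()


def pearson_correlation(p, q):
    """Pearson correlation between the two flattened histograms."""
    return float(np.corrcoef(p.ravel(), q.ravel())[0, 1])


def jaccard_index(p, q):
    """Jaccard index of the sets of occupied bins (intersection over union)."""
    a, b = p > 0, q > 0
    return float(np.logical_and(a, b).sum() / np.logical_or(a, b).sum())


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) in nats.

    eps is a hypothetical smoothing constant that keeps the result finite
    when q has empty bins where p does not.
    """
    p, q = p.ravel(), q.ravel()
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / (q[mask] + eps))))


# Toy data (assumed): "ground truth" and a shifted "synthetic" sample.
rng = np.random.default_rng(0)
truth = rng.normal(size=(5000, 2))
synthetic = rng.normal(loc=0.5, size=(5000, 2))

extent = [[-5, 5], [-5, 5]]
p = binned_density(truth, 20, extent)
q = binned_density(synthetic, 20, extent)

print("correlation:", pearson_correlation(p, q))
print("jaccard:", jaccard_index(p, q))
print("kl:", kl_divergence(p, q))
```

Each of these functions treats the histogram cell by cell, so the sketch only shows how such comparisons are computed, not whether or how strongly they inflate.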
