HYPE: A Benchmark for Human eYe Perceptual Evaluation of Generative Models
Sharon Zhou, Mitchell L. Gordon, Ranjay Krishna, Austin Narcomey, Li Fei-Fei, Michael S. Bernstein

TL;DR
HYPE is a standardized human perceptual benchmark for evaluating the realism of generative models' outputs. It is grounded in psychophysics, reliable, cost-effective, and able to track model improvements.
Contribution
This work introduces HYPE, a validated human evaluation benchmark for generative models, addressing the lack of standardized human assessment methods.
Findings
HYPE can reliably track model improvements across training epochs.
HYPE rankings are consistent and replicable.
HYPE effectively distinguishes performance differences among models.
Abstract
Generative models often use human evaluations to measure the perceived quality of their outputs. Automated metrics are noisy, indirect proxies because they rely on heuristics or pretrained embeddings. However, until now, direct human evaluation strategies have been ad hoc, neither standardized nor validated. Our work establishes a gold standard human benchmark for generative realism. We construct Human eYe Perceptual Evaluation (HYPE), a human benchmark that is (1) grounded in psychophysics research in perception, (2) reliable across different sets of randomly sampled outputs from a model, (3) able to produce separable model performances, and (4) efficient in cost and time. We introduce two variants: one that measures visual perception under adaptive time constraints to determine the threshold at which a model's outputs appear real (e.g. 250ms), and the other a less expensive variant…
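The abstract does not spell out how the adaptive time constraint is implemented. As a rough illustration only, the sketch below uses a generic staircase procedure of the kind common in psychophysics: exposure time shrinks after a run of correct real-vs-fake judgments and grows after a mistake. The function name `run_staircase` and all parameter values (starting exposure, step size, bounds, streak length) are assumptions made for this example, not settings taken from the paper.

```python
def run_staircase(responses, start_ms=500, step_ms=30,
                  min_ms=100, max_ms=1000, n_down=3):
    """Generic staircase over image exposure times (hypothetical parameters).

    responses: iterable of bools, True if the evaluator judged the image
    correctly (real vs. fake) at the current exposure time.
    Returns the final exposure time and the exposure used on each trial.
    """
    exposure = start_ms
    streak = 0          # consecutive correct answers at the current level
    history = []
    for correct in responses:
        history.append(exposure)
        if correct:
            streak += 1
            if streak == n_down:
                # The task is too easy: show the next images for less time.
                exposure = max(min_ms, exposure - step_ms)
                streak = 0
        else:
            # A mistake: give the evaluator more time on the next image.
            streak = 0
            exposure = min(max_ms, exposure + step_ms)
    return exposure, history


# Three correct answers lower the exposure once; one error raises it again.
final_ms, history = run_staircase([True, True, True, False])
print(final_ms, history)
```

Under this kind of rule the exposure time tends to settle near the level where the evaluator can no longer reliably tell generated images from real ones, which is the sort of threshold the abstract refers to.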
Taxonomy
Topics: Visual perception and processing mechanisms · Visual Attention and Saliency Detection · Neural dynamics and brain function
