Evaluating Generative Models via One-Dimensional Code Distributions

Zexi Jia; Pengcheng Luo; Yijia Zhong; Jinchao Zhang; Jie Zhou

arXiv:2603.08064 · cs.CV · March 13, 2026

TL;DR

This paper introduces new token-based metrics for evaluating generative models that better correlate with human perception by analyzing discrete visual tokens, and presents a comprehensive benchmark dataset for stress-testing these metrics.

Contribution

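It proposes two token-based evaluation metrics: Codebook Histogram Distance (CHD), a training-free distribution metric in token space, and Code Mixture Model Score (CMMS), a no-reference quality metric learned from synthetic degradations of token sequences. It also introduces VisForm, a benchmark of 210K images spanning 62 visual forms and 12 generative models with expert annotations.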

Findings

01. Token-based metrics outperform feature-distribution metrics in correlating with human judgments.

02. CHD is training-free, and CMMS is learned from synthetic degradations of token sequences rather than from the images under evaluation; both are applicable across diverse models and data.

03. The VisForm benchmark enables robust evaluation under broad distribution shifts.

Abstract

Most evaluations of generative models rely on feature-distribution metrics such as FID, which operate on continuous recognition features that are explicitly trained to be invariant to appearance variations, and thus discard cues critical for perceptual quality. We instead evaluate models in the space of discrete visual tokens, where modern 1D image tokenizers compactly encode both semantic and perceptual information and quality manifests as predictable token statistics. We introduce Codebook Histogram Distance (CHD), a training-free distribution metric in token space, and Code Mixture Model Score (CMMS), a no-reference quality metric learned from synthetic degradations of token sequences. To stress-test metrics under broad distribution shifts, we further propose VisForm, a benchmark of 210K images spanning 62 visual forms and 12 generative models with expert annotations. Across AGIQA,…
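The idea of a distribution metric in token space can be illustrated with a minimal sketch. The abstract does not spell out how CHD is computed, so everything below is an assumption: the function names are hypothetical, the token IDs are taken to come from some 1D image tokenizer that is not shown, and total variation distance stands in for whatever distance the paper actually uses.

```python
import numpy as np


def codebook_histogram(token_ids, codebook_size):
    """Normalized frequency of each codebook index over a set of token sequences."""
    flat = np.asarray(token_ids, dtype=np.int64).ravel()
    if flat.size == 0:
        raise ValueError("token_ids must contain at least one token")
    if flat.min() < 0 or flat.max() >= codebook_size:
        raise ValueError("token ids must lie in [0, codebook_size)")
    counts = np.bincount(flat, minlength=codebook_size).astype(np.float64)
    return counts / counts.sum()


def histogram_distance(p, q):
    """Total variation distance in [0, 1]. An assumed choice, not the paper's."""
    return 0.5 * float(np.abs(p - q).sum())


def chd_sketch(real_tokens, generated_tokens, codebook_size):
    """Compare codebook usage of real vs. generated images (illustrative only)."""
    p = codebook_histogram(real_tokens, codebook_size)
    q = codebook_histogram(generated_tokens, codebook_size)
    return histogram_distance(p, q)


if __name__ == "__main__":
    # Synthetic token sequences standing in for tokenizer output.
    rng = np.random.default_rng(0)
    real = rng.integers(0, 16, size=(100, 32))
    generated = rng.integers(0, 8, size=(100, 32))  # uses only half the codebook
    print(chd_sketch(real, generated, codebook_size=16))
```

A score of 0 means the two image sets use the codebook identically, and 1 means they share no codes. Like any histogram comparison, this sketch ignores token order within a sequence.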

Code & Models

Datasets

ZexiJia/Visform
dataset · 28 downloads

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Domain Adaptation and Few-Shot Learning