Continuous Visual Autoregressive Generation via Score Maximization

Chenze Shao; Fandong Meng; Jie Zhou

arXiv:2505.07812·cs.CV·May 13, 2025

Open Access · 1 Repo

TL;DR

This paper introduces a continuous visual autoregressive model that generates visual data directly, without quantization. Training maximizes a strictly proper scoring rule, chiefly the energy score, which requires no likelihood computation.

Contribution

The paper proposes a Continuous VAR framework built on strictly proper scoring rules. It generates visual data directly in continuous space, avoiding the information loss that vector quantization can introduce.

Findings

01. Framework based on proper scoring rules improves generation quality.

02. Energy score enables likelihood-free training in continuous space.

03. Connections established with previous methods like GIVT and diffusion loss.

Abstract

Conventional wisdom suggests that autoregressive models are used to process discrete data. When applied to continuous modalities such as visual data, Visual AutoRegressive modeling (VAR) typically resorts to quantization-based approaches to cast the data into a discrete space, which can introduce significant information loss. To tackle this issue, we introduce a Continuous VAR framework that enables direct visual autoregressive generation without vector quantization. The underlying theoretical foundation is strictly proper scoring rules, which provide powerful statistical tools capable of evaluating how well a generative model approximates the true distribution. Within this framework, all we need is to select a strictly proper score and set it as the training objective to optimize. We primarily explore a class of training objectives based on the energy score, which is likelihood-free…
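To make the energy score mentioned in the abstract concrete, here is a minimal sketch of a sample-based estimate. It assumes the textbook form of the energy score, S(p, y) = E||x1 - x2||^alpha - 2 E||x - y||^alpha with x1, x2, x drawn from the model and 0 < alpha < 2, which is stated up to a constant factor. The exact objective, exponent, and number of samples used in the paper may differ. The function names (distance, energy_score, mean_score) are hypothetical and are not taken from the shaochenze/ear repository.

```python
import math
import random
from itertools import combinations


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def energy_score(samples, target, alpha=1.0):
    """Sample-based energy score estimate (higher is better).

    samples: list of vectors drawn from the model (at least two).
    target:  one observed vector from the data.
    alpha:   exponent; the score is strictly proper for 0 < alpha < 2.
    """
    if len(samples) < 2:
        raise ValueError("need at least two model samples")
    if not 0.0 < alpha < 2.0:
        raise ValueError("alpha must lie strictly between 0 and 2")
    # Average distance between distinct model samples (diversity term).
    pairs = list(combinations(samples, 2))
    spread = sum(distance(a, b) ** alpha for a, b in pairs) / len(pairs)
    # Average distance from model samples to the observation (fit term).
    fit = sum(distance(s, target) ** alpha for s in samples) / len(samples)
    return spread - 2.0 * fit


def mean_score(model_mu, model_sigma, trials=4000, seed=0):
    """Average score of a 1-D Gaussian model when the data follow N(0, 1).

    This toy setup is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        target = [rng.gauss(0.0, 1.0)]
        samples = [[rng.gauss(model_mu, model_sigma)] for _ in range(2)]
        total += energy_score(samples, target)
    return total / trials
```

In this toy setup, mean_score(0.0, 1.0) should come out higher than mean_score(1.5, 1.0), because a strictly proper score is maximized in expectation only when the model matches the data distribution. Only samples from the model are needed, never its density, which is the sense in which the objective is likelihood-free. In a training loop one would typically negate the score to obtain a loss and draw the samples from a differentiable generator.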

Code & Models

Repositories

shaochenze/ear (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Domain Adaptation and Few-Shot Learning

Methods: Sparse Evolutionary Training · Diffusion