Generative Refinement Networks for Visual Synthesis

Jian Han; Jinlai Liu; Jiahuan Wang; Bingyue Peng; Zehuan Yuan

arXiv:2604.13030·cs.CV·April 15, 2026

TL;DR

Generative Refinement Networks (GRN) introduce a near-lossless hierarchical binary quantization and a global refinement mechanism, enabling efficient, high-quality visual synthesis across image reconstruction, class-conditional image generation, text-to-image, and text-to-video tasks.

Contribution

GRN combines a near-lossless hierarchical binary quantization with a global refinement process and entropy-guided sampling, advancing autoregressive models for visual generation.

Findings

01. Achieved 0.56 rFID in image reconstruction on ImageNet.
02. Set new records with 1.81 gFID in class-conditional image generation.
03. Demonstrated superior performance in text-to-image and text-to-video tasks.

Abstract

While diffusion models dominate the field of visual generation, they are computationally inefficient, applying the same computational effort regardless of how complex the content is. In contrast, autoregressive (AR) models are inherently complexity-aware, as evidenced by their variable likelihoods, but are often hindered by lossy discrete tokenization and error accumulation. In this work, we introduce Generative Refinement Networks (GRN), a next-generation visual synthesis paradigm that addresses these issues. At its core, GRN removes the discrete tokenization bottleneck through a theoretically near-lossless Hierarchical Binary Quantization (HBQ), achieving reconstruction quality comparable to continuous counterparts. Built upon HBQ's latent space, GRN fundamentally upgrades AR generation with a global refinement mechanism that progressively perfects and corrects artworks, like a human…
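
To give some intuition for why a hierarchy of binary codes can be "theoretically near-lossless", here is a minimal sketch of a generic coarse-to-fine sign quantizer for a single scalar. It is not the paper's HBQ, whose details are not given in this summary. The function names, the [-1, 1] input range, and the halving step schedule are all assumptions made for illustration. Each extra level contributes one bit and halves the worst-case reconstruction error.

```python
# Illustrative only: a generic coarse-to-fine binary quantizer for one scalar.
# This is NOT the paper's HBQ; names, input range, and step schedule are assumed.

def binary_refine_encode(x, levels):
    """Encode x in [-1, 1] as `levels` bits, from coarsest to finest."""
    bits = []
    approx = 0.0   # running reconstruction
    step = 0.5     # size of the correction applied at the current level
    for _ in range(levels):
        bit = 1 if x >= approx else 0      # which side of the estimate x lies on
        approx += step if bit else -step   # move the estimate toward x
        bits.append(bit)
        step /= 2.0                        # finer correction at the next level
    return bits


def binary_refine_decode(bits):
    """Rebuild the approximation by replaying the same corrections."""
    approx = 0.0
    step = 0.5
    for bit in bits:
        approx += step if bit else -step
        step /= 2.0
    return approx


if __name__ == "__main__":
    x = 0.3
    for levels in (1, 2, 4, 8, 16):
        x_hat = binary_refine_decode(binary_refine_encode(x, levels))
        # The error is bounded by 2 ** -levels under the assumptions above.
        print(levels, abs(x - x_hat))
```

Under these assumptions the reconstruction error after k levels is at most 2^-k, so the code becomes arbitrarily close to lossless as levels are added, which is the sense of "near-lossless" this sketch is meant to convey.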
