Visual Autoregressive Modeling for Image Super-Resolution
Yunpeng Qu, Kun Yuan, Jinhua Hao, Kai Zhao, Qizhi Xie, Ming Sun, Chao Zhou

TL;DR
This paper introduces VARSR, an autoregressive model for image super-resolution built on next-scale prediction. It balances fidelity and realism by combining prefix tokens for semantic conditioning, scale-aligned rotary positional encodings, a diffusion refiner, and image-based classifier-free guidance.
Contribution
The paper proposes VARSR, a new autoregressive framework for ISR that integrates prefix tokens carrying the low-resolution condition, scale-aligned spatial encodings, and image-based guidance to improve image quality and efficiency.
Findings
Outperforms diffusion-based methods in quality and efficiency.
Generates high-fidelity, realistic images with semantic consistency.
Utilizes large-scale data and robust training for superior priors.
Abstract
Image Super-Resolution (ISR) has seen significant progress with the introduction of remarkable generative models. However, challenges such as the trade-off between fidelity and realism, as well as computational complexity, have limited their application. Building upon the tremendous success of autoregressive models in the language domain, we propose VARSR, a novel visual autoregressive modeling framework for ISR in the form of next-scale prediction. To effectively integrate and preserve semantic information in low-resolution images, we propose using prefix tokens to incorporate the condition. Scale-aligned Rotary Positional Encodings are introduced to capture spatial structures, and a diffusion refiner is used to model the quantization residual loss and achieve pixel-level fidelity. Image-based Classifier-free Guidance is proposed to guide the…
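As an illustration of the coarse-to-fine idea behind next-scale prediction, the following is a minimal sketch of multi-scale residual tokenization, not the authors' implementation. It assumes a square feature map, nearest-neighbour resizing, and a uniform rounding grid standing in for a learned codebook. The names `resize_nearest`, `quantize`, `encode_multiscale`, and `decode_multiscale` are hypothetical. The autoregressive transformer that would predict each scale from the previous ones (and from the low-resolution prefix tokens) is omitted entirely.

```python
import numpy as np

def resize_nearest(x, size):
    """Nearest-neighbour resize of a square 2-D map to (size, size)."""
    idx = np.arange(size) * x.shape[0] // size
    return x[np.ix_(idx, idx)]

def quantize(x, step=0.5):
    # Assumption: uniform rounding stands in for a learned codebook lookup.
    return np.round(x / step) * step

def encode_multiscale(feat, scales):
    """Split a feature map into quantized residual maps, coarse to fine."""
    residual = feat.copy()
    maps = []
    for s in scales:
        r = quantize(resize_nearest(residual, s))
        maps.append(r)
        # Remove what this scale already explains before moving to the next.
        residual = residual - resize_nearest(r, feat.shape[0])
    return maps

def decode_multiscale(maps, full):
    """Sum the upsampled per-scale maps to rebuild the full-resolution map."""
    out = np.zeros((full, full))
    for r in maps:
        out += resize_nearest(r, full)
    return out

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 8))
scales = [1, 2, 4, 8]  # hypothetical scale schedule
maps = encode_multiscale(feat, scales)
recon = decode_multiscale(maps, 8)
print([m.shape for m in maps])
print(float(np.abs(feat - recon).max()))
```

Because the last scale matches the full resolution, the reconstruction error in this sketch is bounded by half the quantization step. That leftover error is a quantization residual in the generic sense; how the paper's diffusion refiner models it is not described in the abstract text above.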
Taxonomy
Topics: Advanced Image Processing Techniques · Advanced Vision and Imaging · Image Processing Techniques and Applications
Methods: Diffusion
