ExtraVAR: Stage-Aware RoPE Remapping for Resolution Extrapolation in Visual Autoregressive Models

Feihong Yan; Shaoyu Liu; Haixuan Wang; Shuai Lu; Linfeng Zhang; Huiqi Li; Xiangyang Ji

arXiv:2605.10045·cs.CV·May 12, 2026

TL;DR

ExtraVAR combines stage-aware RoPE remapping with adaptive attention calibration to improve high-resolution image synthesis in visual autoregressive (VAR) models, targeting failure modes that arise when a generation stage's dominant RoPE frequency band is disrupted.

Contribution

The paper proposes a training-free method for resolution extrapolation in VAR models. It suppresses global repetition, local repetition, and detail degradation by remapping RoPE frequency bands with stage-specific rules, and it adaptively calibrates attention dispersion.

Findings

01 Outperforms prior methods in structural coherence.

02 Enhances fine-detail fidelity at higher resolutions.

03 Effectively suppresses repetition and detail degradation.

Abstract

Visual Autoregressive (VAR) models have emerged as a strong alternative to diffusion for image synthesis, yet their fixed training resolution prevents direct generation at higher resolutions. Naively transferring training-free extrapolation methods from LLMs or diffusion models to VAR yields three characteristic failure modes: global repetition, local repetition, and detail degradation. We trace them to a unified band-stage mismatch: VAR generates images in a coarse-to-fine, scale-wise process where each stage is driven by a distinct dominant RoPE frequency band, and each failure mode emerges when the dominant band of a particular stage is disrupted. Building on this insight, we propose Stage-Aware RoPE Remapping, a training-free strategy that assigns each frequency band a stage-specific remapping rule, jointly suppressing all three failure modes. We further observe that attention…
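To make the idea of a per-band, per-stage remapping rule concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not give the actual rules, so the band split, the stage names, the RULES table, and the two candidate remappings (position interpolation versus plain extrapolation) are all assumptions chosen for illustration. Positions are treated along a single axis for simplicity.

```python
import math

def rope_frequencies(head_dim, base=10000.0):
    """Standard RoPE angular frequencies, one per channel pair."""
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]

def split_bands(freqs, num_bands=3):
    """Split frequency indices into contiguous bands, highest frequency first.
    The equal-width split is an assumption, not taken from the paper."""
    n = len(freqs)
    bounds = [(k * n) // num_bands for k in range(num_bands + 1)]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(num_bands)]

# HYPOTHETICAL rule table: one rule per (stage, band), bands ordered
# high -> mid -> low frequency. The paper's real rules are not given in
# the abstract; these entries are placeholders.
RULES = {
    "coarse": ("extrapolate", "extrapolate", "interpolate"),
    "middle": ("extrapolate", "interpolate", "extrapolate"),
    "fine":   ("interpolate", "extrapolate", "extrapolate"),
}

def remapped_angles(position, stage, freqs, bands, train_len, target_len):
    """Rotation angle per frequency for one position at a given stage.

    'interpolate' rescales the position into the trained range;
    'extrapolate' uses the raw position unchanged.
    """
    scale = train_len / target_len
    angles = [0.0] * len(freqs)
    for band_idx, band in enumerate(bands):
        rule = RULES[stage][band_idx]
        pos = position * scale if rule == "interpolate" else float(position)
        for i in band:
            angles[i] = pos * freqs[i]
    return angles

freqs = rope_frequencies(head_dim=8)
bands = split_bands(freqs)
coarse = remapped_angles(10, "coarse", freqs, bands, train_len=16, target_len=32)
fine = remapped_angles(10, "fine", freqs, bands, train_len=16, target_len=32)
```

In this sketch the same position yields different rotation angles depending on the stage, because each stage applies a different rule to each band. When target_len equals train_len the scale is 1 and every rule reduces to ordinary RoPE.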

Code & Models

Repositories

feihongyan1/ExtraVAR (GitHub)
