VVS: Accelerating Speculative Decoding for Visual Autoregressive Generation via Partial Verification Skipping

Haotian Dong; Ye Li; Rongwei Lu; Chen Tang; Shu-Tao Xia; Zhi Wang

arXiv:2511.13587 · cs.CV · May 7, 2026

TL;DR

VVS is a speculative decoding (SD) framework for visual autoregressive models that reduces inference latency by skipping verification on some decoding steps, which directly cuts the number of target model forward passes.

Contribution

The paper proposes an SD framework that combines verification skipping, token selection, feature caching, and step scheduling to accelerate visual AR models while keeping generation quality competitive.

Findings

01 Reduces target model forward passes by 2.8×
02 Maintains competitive generation quality
03 Outperforms conventional SD frameworks in speed-quality trade-off

Abstract

Visual autoregressive (AR) generation models have demonstrated strong potential for image generation, yet their next-token-prediction paradigm introduces considerable inference latency. Although speculative decoding (SD) has been proven effective for accelerating visual AR models, its "draft one step, then verify one step" paradigm prevents a direct reduction in the number of forward passes, limiting its acceleration potential. Motivated by the interchangeability of visual tokens, we explore verification skipping in the SD process for the first time to explicitly cut the number of target model forward passes, thereby reducing inference latency. By analyzing the characteristics of the drafting stage, we observe that verification redundancy and stale feature reusability are key factors to maintain generation quality while improving speed for verification-free steps. Inspired by these two…
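To make the forward-pass argument in the abstract concrete, here is a minimal sketch that only counts target model passes. It does not reproduce the paper's method: the function names and the fixed periodic schedule are assumptions made for illustration, and the token selection, feature caching, and step scheduling described above are not modeled.

```python
from typing import List, Sequence


def count_target_passes(verify_schedule: Sequence[bool]) -> int:
    """Count target model forward passes for a per-step schedule.

    verify_schedule[i] is True when draft step i is followed by a
    verification pass of the target model, False when it is skipped.
    """
    return sum(1 for verify in verify_schedule if verify)


def periodic_schedule(num_steps: int, verify_every: int) -> List[bool]:
    """Hypothetical schedule (an assumption, not the paper's scheduler):
    verify only on every `verify_every`-th draft step."""
    if verify_every < 1:
        raise ValueError("verify_every must be >= 1")
    return [step % verify_every == 0 for step in range(num_steps)]


num_steps = 12

# Conventional SD: "draft one step, then verify one step".
conventional = periodic_schedule(num_steps, verify_every=1)

# Partial verification skipping: verify on steps 0, 3, 6, 9 only.
skipping = periodic_schedule(num_steps, verify_every=3)

baseline_passes = count_target_passes(conventional)
skipped_passes = count_target_passes(skipping)
reduction = baseline_passes / skipped_passes

print(f"conventional: {baseline_passes} target passes")
print(f"with skipping: {skipped_passes} target passes")
print(f"reduction: {reduction:.1f}x")
```

The numbers here come from the toy schedule alone and are unrelated to the 2.8× figure reported in the findings. The sketch shows only that, under a strict draft-then-verify loop, the target model runs once per draft step, so the pass count can drop only if some verifications are skipped.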

Code & Models

Repositories

HyattDD/VVS (GitHub)