Q-ARVD: Quantizing Autoregressive Video Diffusion Models

Siao Tang; Xinyin Ma; Gongfan Fang; Xingyi Yang; Xinchao Wang

arXiv:2605.21072 · cs.CV · May 21, 2026

TL;DR

Q-ARVD is a quantization framework designed for autoregressive video diffusion models (ARVDs). It addresses two challenges specific to these models, unbalanced frame-wise quantization sensitivity and outliers, to reduce inference cost for real-time video generation.

Contribution

The paper proposes the first quantization method tailored to ARVDs. It improves efficiency while preserving video generation quality by handling frame-wise quantization sensitivity and outliers.

Findings

01  Q-ARVD outperforms existing quantization schemes on ARVDs.

02  The method handles outliers and unbalanced frame-wise quantization sensitivity.

03  The method demonstrates a significant reduction in inference cost.

Abstract

Autoregressive video diffusion models (ARVDs) have emerged as a promising architecture for streaming video generation, paving the way for real-time interactive video generation and world modeling. Despite their potential, the substantial inference cost of ARVDs remains a major obstacle to practical deployment, making model quantization a natural direction for improving efficiency. However, quantization for ARVDs remains largely unexplored. Our empirical analysis shows that directly applying existing quantization schemes developed for standard diffusion transformers to ARVDs leads to suboptimal performance, revealing quantization behaviors that differ from those observed in bidirectional diffusion models. In this paper, we identify two critical challenges in quantizing ARVDs: (C1) Highly unbalanced frame-wise quantization sensitivity. Error accumulation during autoregressive generation…
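To make the outlier concern concrete, the following is a minimal sketch of generic symmetric per-tensor uniform quantization, not the Q-ARVD method itself, whose details are not given on this page. The function names, the 4-bit setting, the Gaussian "activations", and the outlier value of 200.0 are all illustrative assumptions. It shows how a single large value can stretch the quantization scale and raise the error on every other value.

```python
import random


def quantize_dequantize(values, num_bits=8):
    """Symmetric per-tensor uniform quantization, then dequantization.

    Generic textbook scheme used only for illustration (an assumption,
    not the scheme proposed in the paper).
    """
    qmax = 2 ** (num_bits - 1) - 1            # largest integer level, e.g. 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    out = []
    for v in values:
        q = round(v / scale)                  # map to the nearest integer level
        q = max(-qmax, min(qmax, q))          # clamp to the representable range
        out.append(q * scale)                 # map back to floating point
    return out


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
# Hypothetical activations: standard-normal values (illustrative data only).
activations = [rng.gauss(0.0, 1.0) for _ in range(4096)]
# The same values plus one hypothetical outlier that dominates the range.
with_outlier = activations + [200.0]

# Error on the ordinary values, without and with the outlier setting the scale.
err_clean = mean_squared_error(
    activations, quantize_dequantize(activations, num_bits=4)
)
err_outlier = mean_squared_error(
    activations, quantize_dequantize(with_outlier, num_bits=4)[:-1]
)

print(f"MSE without outlier: {err_clean:.4f}")
print(f"MSE with outlier:    {err_outlier:.4f}")
```

In this toy setting the outlier forces a step size so coarse that the ordinary values all collapse to zero, which is one reason quantization schemes often treat outliers separately.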

Code & Models

Repositories

tsa18/Q-ARVD (GitHub)