QVD: Post-training Quantization for Video Diffusion Models

Shilong Tian; Hong Chen; Chengtao Lv; Yu Liu; Jinyang Guo; Xianglong Liu; Shengxi Li; Hao Yang; Tao Xie

arXiv:2407.11585·cs.CV·July 18, 2024

Open Access

TL;DR

This paper introduces QVD, a post-training quantization method tailored for video diffusion models, addressing their high memory and latency issues by proposing techniques that preserve temporal discriminability and improve channel coverage.

Contribution

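QVD is presented as the first post-training quantization (PTQ) strategy for video diffusion models (VDMs). It combines High Temporal Discriminability Quantization (HTDQ) for temporal features with Scattered Channel Range Integration (SCRI), which improves the coverage of quantization levels by individual channels.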

Findings

01. Achieved near-lossless performance at W8A8 (8-bit weights, 8-bit activations).

02. Outperformed existing methods by 205.12 in FVD (Fréchet Video Distance, lower is better).

03. Validated effectiveness across various models and datasets.

Abstract

Recently, video diffusion models (VDMs) have garnered significant attention due to their notable advancements in generating coherent and realistic video content. However, processing multiple frame features concurrently, coupled with the considerable model size, results in high latency and extensive memory consumption, hindering their broader application. Post-training quantization (PTQ) is an effective technique to reduce memory footprint and improve computational efficiency. Unlike image diffusion, we observe that the temporal features, which are integrated into all frame features, exhibit pronounced skewness. Furthermore, we investigate significant inter-channel disparities and asymmetries in the activation of video diffusion models, resulting in low coverage of quantization levels by individual channels and increasing the challenge of quantization. To address these issues, we…
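
To make the "low coverage of quantization levels by individual channels" observation concrete, here is a minimal sketch in Python. It is a generic per-tensor asymmetric uniform quantizer applied to synthetic activations, not the paper's HTDQ or SCRI method. The function names (`quantize_per_tensor`, `dequantize`, `channel_coverage`), the coverage metric, and the channel statistics are all assumptions made for illustration.

```python
import numpy as np

def quantize_per_tensor(x, n_bits=8):
    """Simulated asymmetric uniform quantization with one scale and
    zero-point shared by the whole tensor (illustrative, not QVD)."""
    qmax = 2 ** n_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / qmax
    if scale == 0.0:  # constant tensor: avoid division by zero
        scale = 1.0
    zero_point = int(np.round(-x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, 0, qmax).astype(np.int64)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map integer codes back to real values."""
    return (q - zero_point) * scale

def channel_coverage(q, n_bits=8):
    """Hypothetical metric: fraction of the 2**n_bits integer levels
    spanned by each channel (rows of q)."""
    span = q.max(axis=1) - q.min(axis=1) + 1
    return span / 2 ** n_bits

# Synthetic activations, shape (channels, samples). The channels have very
# different spreads and one is shifted, mimicking inter-channel disparity
# and asymmetry. These numbers are made up for the example.
rng = np.random.default_rng(0)
stds = np.array([0.05, 1.0, 5.0, 0.5])
offsets = np.array([0.0, 0.0, 0.0, 20.0])
x = rng.normal(size=(4, 1024)) * stds[:, None] + offsets[:, None]

q, scale, zero_point = quantize_per_tensor(x, n_bits=8)
x_hat = dequantize(q, scale, zero_point)
coverage = channel_coverage(q, n_bits=8)

for c, cov in enumerate(coverage):
    print(f"channel {c}: covers {cov:.1%} of the 256 levels")
```

Because the shared range is set by the widest and most shifted channels, the narrow channel should occupy only a handful of the 256 levels, so most of its variation is rounded away. That is the kind of per-channel coverage gap the abstract describes.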

Taxonomy

Topics: Image and Signal Denoising Methods · Medical Imaging Techniques and Applications

Methods: Softmax · Attention Is All You Need · Diffusion