ResQ: Residual Quantization for Video Perception

Davide Abati; Haitam Ben Yahia; Markus Nagel; Amirhossein Habibian

arXiv:2308.09511·cs.CV·August 21, 2023

TL;DR

ResQ introduces a novel residual quantization method that leverages temporal redundancies in video to improve the efficiency and accuracy of video perception tasks like segmentation and pose estimation.

Contribution

The paper proposes Residual Quantization (ResQ), a new low-bit quantization scheme that incorporates temporal dependencies for enhanced video perception performance.

Findings

01

ResQ outperforms standard quantization methods in accuracy vs. bit-width trade-offs.

02

Dynamic bit-width adjustment improves efficiency based on video changes.

03

ResQ achieves superior results on semantic segmentation and pose estimation benchmarks.

Abstract

This paper accelerates video perception, such as semantic segmentation and human pose estimation, by leveraging cross-frame redundancies. Unlike existing approaches, which avoid redundant computations by warping past features using optical flow or by performing sparse convolutions on frame differences, we approach the problem from a new perspective: low-bit quantization. We observe that residuals, defined as the difference in network activations between two neighboring frames, exhibit properties that make them highly quantizable. Based on this observation, we propose a novel quantization scheme for video networks, coined Residual Quantization (ResQ). ResQ extends the standard, frame-by-frame quantization scheme by incorporating temporal dependencies that lead to better performance in terms of accuracy vs. bit-width. Furthermore, we extend our model to dynamically adjust the bit-width…
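The observation that residuals are highly quantizable can be illustrated with a small numerical sketch. This is not the authors' implementation: the quantizer below is a generic symmetric uniform quantizer with a per-tensor scale, the activations are synthetic Gaussian data, and the names `quantize_uniform`, `prev`, and `curr` are hypothetical. It also assumes the previous frame's activations are available in full precision, which is a simplification. Under these assumptions, quantizing the small-range residual at a given bit-width gives a much lower reconstruction error than quantizing the current frame's activations directly at the same bit-width.

```python
import numpy as np

def quantize_uniform(x, bits):
    """Symmetric uniform fake-quantization with a per-tensor scale (illustrative)."""
    qmax = 2 ** (bits - 1) - 1           # e.g. 7 for 4 bits
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:                   # nothing to quantize
        return x.copy()
    scale = max_abs / qmax               # step size covering the tensor's range
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

# Synthetic stand-ins for activations of two neighboring frames (assumption):
# the current frame differs from the previous one by a small perturbation.
rng = np.random.default_rng(0)
prev = rng.normal(0.0, 1.0, size=(64, 32, 32))
curr = prev + rng.normal(0.0, 0.05, size=prev.shape)

bits = 4

# Frame-by-frame baseline: quantize the current activations directly.
direct = quantize_uniform(curr, bits)

# Residual variant: quantize only the difference, then add it back.
residual_q = quantize_uniform(curr - prev, bits)
recon = prev + residual_q

err_direct = float(np.mean((curr - direct) ** 2))
err_residual = float(np.mean((curr - recon) ** 2))
print(f"direct MSE:   {err_direct:.6f}")
print(f"residual MSE: {err_residual:.6f}")
```

The gap comes from the step size: the residual spans a much narrower range than the activations, so the same number of quantization levels covers it far more finely.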

Taxonomy

Topics: CCD and CMOS Imaging Sensors · Advanced Vision and Imaging · Visual Attention and Saliency Detection

Methods: Sparse Convolutions