Perceptual Vector Quantization For Video Coding
Jean-Marc Valin, Timothy B. Terriberry

TL;DR
This paper introduces a perceptual vector quantization method for video coding that conserves energy to better preserve textures, resulting in significant bitrate reductions compared to scalar quantization.
Contribution
The paper applies gain-shape vector quantization to video coding in the Daala codec, borrowing the energy conservation principle from the CELT mode of the Opus audio codec to preserve textures and reduce bitrate.
Findings
Achieves an average improvement of 0.90 dB over scalar quantization on still images.
Reduces bitrate by approximately 24.8% at equal quality on still images.
Improves compression efficiency on video by about 13.7% relative to scalar quantization.
Abstract
This paper applies energy conservation principles to the Daala video codec using gain-shape vector quantization to encode a vector of AC coefficients as a length (gain) and direction (shape). The technique originates from the CELT mode of the Opus audio codec, where it is used to conserve the spectral envelope of an audio signal. Conserving energy in video has the potential to preserve textures rather than low-passing them. Explicitly quantizing a gain allows a simple contrast masking model with no signaling cost. Vector quantizing the shape keeps the number of degrees of freedom the same as scalar quantization, avoiding redundancy in the representation. We demonstrate how to predict the vector by transforming the space it is encoded in, rather than subtracting off the predictor, which would make energy conservation impossible. We also derive an encoding of the vector-quantized…
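The gain-shape idea described in the abstract can be illustrated with a minimal sketch. It is not the Daala implementation. The function names, the uniform gain quantizer, and the greedy pulse search are assumptions made for illustration; the shape codebook (integer vectors whose absolute values sum to `k`, normalized to unit length) follows the pyramid-style codebook associated with CELT. The example vector and step sizes are made up, and the comparison is not rate-matched.

```python
import math

def gain_shape_split(x):
    """Split a vector into its length (gain) and unit-norm direction (shape)."""
    gain = math.sqrt(sum(v * v for v in x))
    if gain == 0.0:
        return 0.0, [0.0] * len(x)
    return gain, [v / gain for v in x]

def quantize_shape(shape, k):
    """Greedy search for an integer vector y with sum(|y_i|) == k (k >= 1)
    that points in roughly the same direction as `shape`; returns it
    normalized to unit length. Hypothetical helper, for illustration only."""
    n = len(shape)
    y = [0] * n
    xy = 0.0  # running sum of |shape_i| * y_i
    yy = 0.0  # running sum of y_i ** 2
    for _ in range(k):
        best_i, best_score = 0, -1.0
        for i in range(n):
            # Effect of placing one more pulse at position i.
            num = xy + abs(shape[i])
            den = yy + 2 * y[i] + 1
            score = num * num / den  # squared cosine similarity, up to a constant
            if score > best_score:
                best_i, best_score = i, score
        xy += abs(shape[best_i])
        yy += 2 * y[best_i] + 1
        y[best_i] += 1
    norm = math.sqrt(yy)
    # Restore the signs of the input and normalize to unit length.
    return [(yi if s >= 0 else -yi) / norm for yi, s in zip(y, shape)]

def gain_shape_quantize(x, gain_step, k):
    """Quantize gain and shape separately, then recombine."""
    gain, shape = gain_shape_split(x)
    q_gain = gain_step * round(gain / gain_step)  # assumed uniform gain quantizer
    q_shape = quantize_shape(shape, k)
    return [q_gain * v for v in q_shape]

def scalar_quantize(x, step):
    """Plain uniform scalar quantization of each coefficient."""
    return [step * round(v / step) for v in x]

def energy(x):
    return sum(v * v for v in x)

# Made-up "texture-like" vector: many small coefficients of similar magnitude.
x = [3.0, -2.5, 2.0, -3.5, 2.5, -2.0, 3.0, -2.5]
scalar_rec = scalar_quantize(x, 8.0)
gs_rec = gain_shape_quantize(x, 8.0, 4)
print("input energy:      ", energy(x))
print("scalar energy:     ", energy(scalar_rec))
print("gain-shape energy: ", energy(gs_rec))
```

With this coarse step, every coefficient is smaller than half the step, so the scalar quantizer rounds the whole vector to zero and the texture disappears. The gain-shape version always reconstructs a vector whose length equals the quantized gain, so the energy survives even when the direction is only roughly right.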
