TL;DR
This paper introduces an entropy model for neural video compression that captures spatial-temporal dependencies and supports content-adaptive quantization, reaching an 18.2% bitrate reduction over H.266 (VTM) on the UVG dataset.
Contribution
It proposes a new entropy model that leverages spatial and temporal correlations and includes a content-adaptive quantization mechanism, advancing neural video codec performance.
Findings
Achieves an 18.2% bitrate reduction on the UVG dataset compared to H.266 (VTM).
Introduces a versatile entropy model for better rate-distortion performance.
Enables smooth rate adjustment through content-adaptive quantization and dynamic bit allocation.
Abstract
For a neural video codec, it is critical, yet challenging, to design an efficient entropy model that can accurately predict the probability distribution of the quantized latent representation. However, most existing video codecs directly use a ready-made entropy model from image codecs to encode the residual or motion, and do not fully leverage the spatial-temporal characteristics of video. To this end, this paper proposes a powerful entropy model that efficiently captures both spatial and temporal dependencies. In particular, we introduce the latent prior, which exploits the correlation among latent representations to squeeze out temporal redundancy. Meanwhile, the dual spatial prior is proposed to reduce spatial redundancy in a parallel-friendly manner. In addition, our entropy model is also versatile. Besides estimating the probability distribution, our entropy model also…
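To make the role of the entropy model concrete, here is a minimal sketch of how a predicted probability distribution translates into a bit cost for a quantized latent, and how the quantization step changes that cost. It is an illustration only, not the paper's method: the Gaussian distribution, the single scalar `step`, and the names `symbol_bits` and `total_bits` are assumptions made for this example.

```python
import math

def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian; the Gaussian form is an assumption for illustration."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))

def quantize(y, step):
    """Divide by the quantization step and round to the nearest integer symbol."""
    return round(y / step)

def symbol_bits(q, mean, scale, step):
    """Ideal code length (bits) of symbol q: -log2 of the probability mass that the
    predicted distribution assigns to the quantization bin around q * step."""
    lo = gaussian_cdf((q - 0.5) * step, mean, scale)
    hi = gaussian_cdf((q + 0.5) * step, mean, scale)
    p = max(hi - lo, 1e-9)  # floor avoids log(0) for very unlikely symbols
    return -math.log2(p)

def total_bits(latents, means, scales, step):
    """Sum the ideal code length over all latent values."""
    return sum(
        symbol_bits(quantize(y, step), m, s, step)
        for y, m, s in zip(latents, means, scales)
    )

# Hypothetical latent values and the (mean, scale) an entropy model might predict.
latents = [0.3, -1.2, 2.5, 0.0]
means = [0.0, -1.0, 2.0, 0.0]
scales = [1.0, 1.0, 1.0, 1.0]

fine = total_bits(latents, means, scales, step=0.5)    # small step: more bits
coarse = total_bits(latents, means, scales, step=2.0)  # large step: fewer bits
print(f"fine step: {fine:.2f} bits, coarse step: {coarse:.2f} bits")
```

Two things are visible in this sketch. A prediction whose mean sits close to the actual symbol assigns it more probability mass and therefore fewer bits, which is why a more accurate entropy model lowers the bitrate. A larger quantization step widens each bin and also lowers the bitrate, at the cost of more distortion, which is the lever a rate-adjustment mechanism can vary.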
