End-to-end Neural Video Coding Using a Compound Spatiotemporal Representation
Haojie Liu, Ming Lu, Zhiqi Chen, Xun Cao, Zhan Ma, Yao Wang

TL;DR
This paper introduces a hybrid neural video coding method that combines vector-based and adaptive kernel-based resampling techniques through a compound spatiotemporal representation, achieving state-of-the-art compression efficiency.
Contribution
It proposes a novel hybrid motion compensation approach with a compound spatiotemporal representation and a multi-prediction decoder, improving robustness and coding efficiency over existing methods.
Findings
Outperforms traditional codecs like H.264 and H.265 in PSNR and MS-SSIM.
Provides better motion prediction and robustness to occlusions.
Achieves state-of-the-art performance in low-delay scenarios.
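For readers unfamiliar with the PSNR metric cited in the findings, the following is a minimal sketch of how it is conventionally computed for a single 8-bit frame. The function name psnr and the nested-list frame layout are illustrative assumptions, not taken from the paper, and the paper's exact evaluation protocol (colour space, per-frame averaging) is not stated in this summary.

```python
import math


def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are nested lists of pixel values (an assumed layout for this sketch).
    """
    squared_error = 0.0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        for r, d in zip(ref_row, dec_row):
            squared_error += (r - d) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames: distortion is zero
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher values indicate lower distortion; codec comparisons typically report PSNR against bitrate rather than as a single number.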
Abstract
Recent years have witnessed rapid advances in learnt video coding. Most algorithms have solely relied on the vector-based motion representation and resampling (e.g., optical flow based bilinear sampling) for exploiting the inter frame redundancy. In spite of the great success of adaptive kernel-based resampling (e.g., adaptive convolutions and deformable convolutions) in video prediction for uncompressed videos, integrating such approaches with rate-distortion optimization for inter frame coding has been less successful. Recognizing that each resampling solution offers unique advantages in regions with different motion and texture characteristics, we propose a hybrid motion compensation (HMC) method that adaptively combines the predictions generated by these two approaches. Specifically, we generate a compound spatiotemporal representation (CSTR) through a recurrent information…
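The two resampling families named in the abstract can be illustrated with a toy example. The sketch below is not the authors' HMC or CSTR design, whose details are cut off above; it only shows, under simplifying assumptions, what flow-based bilinear sampling, per-pixel adaptive-kernel resampling, and a per-pixel blend of the two predictions look like on a small single-channel frame. All function names, the 3x3 kernel size, and the blending weight alpha are hypothetical choices for illustration.

```python
import math


def bilinear_sample(frame, x, y):
    """Sample a frame (list of rows) at fractional (x, y), clamping to the border."""
    h, w = len(frame), len(frame[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * frame[y0][x0] + ax * frame[y0][x1]
    bottom = (1 - ax) * frame[y1][x0] + ax * frame[y1][x1]
    return (1 - ay) * top + ay * bottom


def flow_warp(ref, flow):
    """Vector-based prediction: pixel (x, y) fetches ref at (x + dx, y + dy)."""
    h, w = len(ref), len(ref[0])
    return [[bilinear_sample(ref, x + flow[y][x][0], y + flow[y][x][1])
             for x in range(w)] for y in range(h)]


def kernel_resample(ref, kernels):
    """Kernel-based prediction: each pixel applies its own 3x3 weights (assumed size)."""
    h, w = len(ref), len(ref[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy = min(max(y + ky - 1, 0), h - 1)  # clamp at the border
                    xx = min(max(x + kx - 1, 0), w - 1)
                    acc += kernels[y][x][ky][kx] * ref[yy][xx]
            out[y][x] = acc
    return out


def blend(pred_a, pred_b, alpha):
    """Per-pixel convex combination; alpha[y][x] in [0, 1] weights pred_a."""
    return [[alpha[y][x] * pred_a[y][x] + (1 - alpha[y][x]) * pred_b[y][x]
             for x in range(len(pred_a[0]))] for y in range(len(pred_a))]
```

In a learned codec the flow, kernels, and blending weights would all be produced by networks and trained under a rate-distortion objective; here they are simply passed in as plain lists.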
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Video Coding and Compression Technologies
