Neural Video Compression with Diverse Contexts
Jiahao Li, Bin Li, Yan Lu

TL;DR
This paper enhances neural video codecs by increasing context diversity across temporal and spatial dimensions, leading to significant bitrate savings and surpassing traditional codecs in PSNR.
Contribution
It introduces hierarchical quality pattern learning, group-based offset diversity, and quadtree spatial partitioning to improve context utilization in neural video compression.
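To make the quadtree idea more concrete, the sketch below shows one possible way to schedule entropy coding so that every coding step touches all four positions of each 2x2 spatial block, but in different channel groups. The function name `quadtree_step_masks`, the four-group channel split, and the rule "group g codes position (g + s) mod 4 at step s" are illustrative assumptions, not details confirmed by this summary.

```python
import numpy as np

def quadtree_step_masks(channels, height, width, groups=4):
    """Return four boolean masks of shape (channels, height, width), one per coding step.

    Hypothetical schedule: channels are split into `groups` equal groups, and at
    step s the channel group g codes the 2x2-block position (g + s) % 4.
    """
    assert channels % groups == 0, "channels must divide evenly into groups"
    assert height % 2 == 0 and width % 2 == 0, "spatial size must be even"

    # Index (0..3) of each spatial location inside its 2x2 block.
    rows = np.arange(height)[:, None] % 2
    cols = np.arange(width)[None, :] % 2
    pos = rows * 2 + cols                                  # shape (height, width)

    # Channel-group index of every channel.
    group = np.arange(channels) // (channels // groups)    # shape (channels,)

    # Step at which each latent element is coded under the assumed rule.
    step_of = (pos[None, :, :] - group[:, None, None]) % 4
    return [step_of == s for s in range(4)]

masks = quadtree_step_masks(channels=8, height=4, width=4)
```

Under this assumed schedule, elements coded at a later step can condition on already-coded neighbours from every spatial position and from other channel groups, which is one way to read "diverse spatial contexts".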
Findings
Achieved 23.5% bitrate reduction over previous SOTA NVC.
Surpassed traditional codecs in PSNR for RGB and YUV420.
Demonstrated that context diversity strategies improve compression efficiency.
Abstract
For any video codec, coding efficiency relies heavily on whether the current signal to be encoded can find relevant contexts in the previously reconstructed signals. Traditional codecs have verified that more contexts bring substantial coding gain, but in a time-consuming manner. However, for the emerging neural video codec (NVC), contexts are still limited, leading to a low compression ratio. To boost NVC, this paper proposes increasing the context diversity in both temporal and spatial dimensions. First, we guide the model to learn hierarchical quality patterns across frames, which enriches long-term yet high-quality temporal contexts. Furthermore, to tap the potential of the optical flow-based coding framework, we introduce a group-based offset diversity in which cross-group interaction is proposed for better context mining. In addition, this paper also adopts a quadtree-based…
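One way to picture a hierarchical quality pattern is a training loss whose distortion weight cycles over consecutive frames, so that some frames are periodically pushed to higher quality and can serve as better long-term references. The following is a minimal sketch under that assumption; the function names and the weight values in `pattern` are hypothetical and are not taken from the paper.

```python
def frame_weights(num_frames, pattern=(0.5, 1.2, 0.5, 0.9)):
    """Repeat a periodic distortion-weight pattern over num_frames frames.

    The default pattern values are placeholders chosen for illustration only.
    """
    return [pattern[t % len(pattern)] for t in range(num_frames)]

def weighted_rd_loss(rates, distortions, lam, pattern=(0.5, 1.2, 0.5, 0.9)):
    """Average rate-distortion loss with a per-frame distortion weight.

    rates[t] and distortions[t] are the bitrate and distortion of frame t;
    lam is the usual rate-distortion trade-off factor.
    """
    assert len(rates) == len(distortions)
    weights = frame_weights(len(rates), pattern)
    total = sum(r + w * lam * d for r, w, d in zip(rates, weights, distortions))
    return total / len(rates)
```

Frames that receive a larger weight are penalised more for distortion, so the model is nudged to spend more bits on them; the remaining frames can then draw on these higher-quality reconstructions as context.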
Taxonomy
Topics: Advanced Vision and Imaging · Image and Signal Denoising Methods · Advanced Data Compression Techniques
