Hybrid Local-Global Context Learning for Neural Video Compression

Yongqi Zhai; Jiayu Yang; Wei Jiang; Chunhui Yang; Luyang Tang; Ronggang Wang

arXiv:2412.00446·cs.MM·December 3, 2024

Open Access

TL;DR

This paper introduces a hybrid local-global context learning approach for neural video compression. It applies flow-guided deformable compensation to the largest-scale features and flow-based warping to the smaller-scale features, aiming for accurate motion compensation at a low bit cost.

Contribution

The paper proposes a hybrid context generation module and a local-global context enhancement technique to improve the efficiency of motion compensation in neural video codecs.

Findings

01. Significant performance improvement over state-of-the-art methods

02. Effective reduction in bit cost for motion coding

03. Enhanced accuracy in complex scene motion estimation

Abstract

In neural video codecs, current state-of-the-art methods typically adopt multi-scale motion compensation to handle diverse motions. These methods estimate and compress either optical flow or deformable offsets to reduce inter-frame redundancy. However, flow-based methods often suffer from inaccurate motion estimation in complicated scenes. Deformable convolution-based methods are more robust but have a higher bit cost for motion coding. In this paper, we propose a hybrid context generation module, which combines the advantages of the above methods in an optimal way and achieves accurate compensation at a low bit cost. Specifically, considering the characteristics of features at different scales, we adopt flow-guided deformable compensation at the largest scale to produce accurate alignment in detailed regions. For smaller-scale features, we perform flow-based warping to save the bit cost…
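
The split described in the abstract (flow-guided deformable compensation at the largest scale, flow-based warping at the smaller scales) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes single-channel feature maps stored as nested lists, a 3x3 sampling grid, border clamping, and hand-set offsets and weights; in an actual codec the features are multi-channel and the flow, residual offsets, and weights would be predicted by networks and compressed. All function and variable names below are hypothetical.

```python
import math

# 3x3 sampling grid for the deformable step (an assumed kernel size).
TAPS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def bilinear_sample(feat, y, x):
    """Read a 2-D map at fractional (y, x), clamping to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = feat[y0][x0] * (1 - wx) + feat[y0][x1] * wx
    bot = feat[y1][x0] * (1 - wx) + feat[y1][x1] * wx
    return top * (1 - wy) + bot * wy

def flow_warp(feat, flow):
    """Flow-based warping: one sample per position, at (y + dy, x + dx)."""
    h, w = len(feat), len(feat[0])
    return [[bilinear_sample(feat, y + flow[y][x][0], x + flow[y][x][1])
             for x in range(w)] for y in range(h)]

def flow_guided_deformable(feat, flow, residual, weights):
    """Deformable compensation whose offsets are the flow plus a per-tap residual.

    residual[y][x][k] is the extra (dy, dx) for tap k; weights[k] is a
    fixed tap weight (a stand-in for learned kernel weights).
    """
    h, w = len(feat), len(feat[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            fy, fx = flow[y][x]
            for k, (ty, tx) in enumerate(TAPS):
                ry, rx = residual[y][x][k]
                out[y][x] += weights[k] * bilinear_sample(
                    feat, y + ty + fy + ry, x + tx + fx + rx)
    return out

def hybrid_contexts(pyramid, flows, residual, weights):
    """pyramid[0] is the largest scale; flows[i] has the size of pyramid[i]."""
    contexts = [flow_guided_deformable(pyramid[0], flows[0], residual, weights)]
    contexts += [flow_warp(f, fl) for f, fl in zip(pyramid[1:], flows[1:])]
    return contexts

# Toy usage: a horizontal ramp shifted by one pixel at the large scale
# and by half a pixel at the small scale.
ramp = [[float(x) for x in range(4)] for _ in range(4)]
shift = [[(0.0, 1.0)] * 4 for _ in range(4)]
zero_res = [[[(0.0, 0.0)] * 9 for _ in range(4)] for _ in range(4)]
center_only = [1.0 if t == (0, 0) else 0.0 for t in TAPS]
small = [[0.0, 1.0], [0.0, 1.0]]
small_flow = [[(0.0, 0.5)] * 2 for _ in range(2)]
ctx = hybrid_contexts([ramp, small], [shift, small_flow], zero_res, center_only)
```

With zero residual offsets and all weight on the center tap, the deformable step reduces to plain flow warping, which is why it can be read as a refinement of the flow. Only the largest scale carries the nine per-position residual offsets; the smaller scales need nothing beyond the flow, which is consistent with the bit-cost argument in the abstract.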

Taxonomy

Topics: Image Retrieval and Classification Techniques · Video Analysis and Summarization · Speech Recognition and Synthesis

Methods: 1x1 Convolution · Average Pooling · Global Average Pooling · ADaptive gradient method with the OPTimal convergence rate · Context Enhancement Module