Learning Cross-Scale Weighted Prediction for Efficient Neural Video Compression
Zongyu Guo, Runsen Feng, Zhizheng Zhang, Xin Jin, Zhibo Chen

TL;DR
This paper introduces a neural video codec that combines cross-scale weighted prediction with multi-stage quantization. By improving motion compensation and rate-distortion efficiency, it achieves performance competitive with standard codecs such as VVC.
Contribution
It proposes a new cross-scale prediction module with weighted prediction and a multi-stage quantization strategy, improving neural video compression adaptability and performance.
Findings
The proposed codec (ENVC) is competitive with VVC in sRGB PSNR on the UVG dataset.
Cross-scale prediction effectively handles diverse video content.
Multi-stage quantization enhances rate-distortion performance.
Abstract
Neural video codecs have demonstrated great potential in video transmission and storage applications. Existing neural hybrid video coding approaches rely on optical flow or Gaussian-scale flow for prediction, which cannot support fine-grained adaptation to diverse motion content. Towards more content-adaptive prediction, we propose a novel cross-scale prediction module that achieves more effective motion compensation. Specifically, on the one hand, we produce a reference feature pyramid as prediction sources and then transmit cross-scale flows that leverage the feature scale to control the precision of prediction. On the other hand, for the first time, a weighted prediction mechanism is introduced even if only a single reference frame is available, which can help synthesize a fine prediction result by transmitting cross-scale weight maps. In addition to the cross-scale prediction…
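The weighted-prediction idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single-channel reference, builds the pyramid with plain 2x2 average pooling, omits the cross-scale flow warping entirely, and normalizes the weight maps with a per-pixel softmax. The function names (`build_pyramid`, `upsample_nearest`, `weighted_cross_scale_prediction`) are hypothetical and chosen only for this example.

```python
import numpy as np

def build_pyramid(feat, levels=3):
    """Return [feat, 1/2-scale, 1/4-scale, ...] using 2x2 average pooling (an assumption)."""
    pyramid = [feat]
    for _ in range(levels - 1):
        f = pyramid[-1]
        # Crop to even dimensions so the 2x2 blocks tile the array exactly.
        h, w = f.shape[0] // 2 * 2, f.shape[1] // 2 * 2
        f = f[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        pyramid.append(f)
    return pyramid

def upsample_nearest(f, shape):
    """Nearest-neighbour resize of a 2-D array back to `shape`."""
    rows = np.arange(shape[0]) * f.shape[0] // shape[0]
    cols = np.arange(shape[1]) * f.shape[1] // shape[1]
    return f[rows][:, cols]

def weighted_cross_scale_prediction(ref, weight_logits):
    """Blend pyramid levels of `ref` with per-pixel weights.

    ref:           (H, W) reference feature map
    weight_logits: (levels, H, W) unnormalized weight maps, one per scale
    """
    levels = weight_logits.shape[0]
    pyramid = build_pyramid(ref, levels)
    # Bring every scale back to full resolution so they can be blended per pixel.
    candidates = np.stack([upsample_nearest(p, ref.shape) for p in pyramid])
    # Softmax over the scale axis (illustrative choice of normalization).
    w = np.exp(weight_logits - weight_logits.max(axis=0, keepdims=True))
    w /= w.sum(axis=0, keepdims=True)
    return (w * candidates).sum(axis=0)
```

In this sketch, a pixel whose weights favor the finest level keeps sharp detail from the reference, while a pixel whose weights favor a coarse level receives a smoothed prediction, which is one way to read "leveraging the feature scale to control the precision of prediction".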
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Video Coding and Compression Technologies
