Neural B-Frame Coding: Tackling Domain Shift Issues with Lightweight Online Motion Resolution Adaptation
Sang NguyenQuang, Xiem HoangVan, Wen-Hsiao Peng

TL;DR
This paper presents lightweight classifiers for adaptive downsampling in learned B-frame codecs, effectively addressing domain shift issues with minimal computational overhead and without retraining existing codecs.
Contribution
Introduces three novel lightweight classifiers for predicting downsampling factors, improving rate-distortion performance and reducing complexity in learned B-frame video coding.
Findings
Achieve coding performance comparable to exhaustive search methods.
Significantly reduce computational complexity compared with exhaustive rate-distortion search.
Seamlessly integrate with existing B-frame codecs without retraining.
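The exhaustive search baseline mentioned in the findings can be pictured as follows. This is a minimal sketch, not the authors' implementation: it assumes the search encodes the frame once per candidate downsampling factor and keeps the factor with the lowest rate-distortion cost J = D + λR. The function name `exhaustive_factor_search`, the `encode` callback, and the toy numbers are all hypothetical and used only for illustration.

```python
from typing import Callable, Sequence, Tuple


def exhaustive_factor_search(
    factors: Sequence[int],
    encode: Callable[[int], Tuple[float, float]],
    lam: float,
) -> Tuple[int, float]:
    """Return (best_factor, best_cost) minimising J = D + lam * R.

    `encode(s)` is a hypothetical callback that runs a full trial encode
    with motion estimated at downsampling factor `s` and returns
    (rate, distortion).
    """
    best_factor, best_cost = -1, float("inf")
    for s in factors:
        rate, distortion = encode(s)  # one full trial encode per candidate
        cost = distortion + lam * rate
        if cost < best_cost:
            best_factor, best_cost = s, cost
    return best_factor, best_cost


# Toy stand-in for a codec: made-up (rate, distortion) pairs per factor.
toy_results = {1: (0.30, 2.0), 2: (0.20, 1.5), 4: (0.15, 2.5)}
best, cost = exhaustive_factor_search(
    [1, 2, 4], lambda s: toy_results[s], lam=10.0
)
print(best, cost)
```

The cost of this loop grows with the number of candidate factors, since each one needs a trial encode. A classifier that predicts the factor directly replaces the loop with a single inference followed by one encode.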
Abstract
Learned B-frame codecs with hierarchical temporal prediction often encounter the domain-shift issue due to mismatches between the Group-of-Pictures (GOP) sizes for training and testing, leading to inaccurate motion estimates, particularly for large motion. A common solution is to turn large motion into small motion by downsampling video frames during motion estimation. However, determining the optimal downsampling factor typically requires costly rate-distortion optimization. This work introduces lightweight classifiers to predict downsampling factors. These classifiers leverage simple state signals from current and reference frames to balance rate-distortion performance with computational cost. Three variants are proposed: (1) a binary classifier (Bi-Class) trained with Focal Loss to choose between high and low resolutions, (2) a multi-class classifier (Mu-Class) trained with novel…
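For readers unfamiliar with Focal Loss, which the abstract names as the training objective for Bi-Class, the following is a minimal sketch of its standard binary form, FL(p_t) = -α_t (1 - p_t)^γ log(p_t). The defaults γ = 2 and α = 0.25 are the values commonly quoted for Focal Loss in general and are an assumption here; the abstract does not state which values the authors use, nor which resolution is treated as the positive class.

```python
import math


def binary_focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Standard binary focal loss for one sample.

    p     : predicted probability of the positive class (assumed here to
            stand for one of the two resolutions; the mapping is hypothetical)
    y     : ground-truth label, 0 or 1
    gamma : focusing parameter; larger values down-weight easy samples
    alpha : class-balancing weight for the positive class
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.9, 1)  # confident and correct: heavily down-weighted
hard = binary_focal_loss(0.1, 1)  # confident and wrong: dominates the loss
print(easy < hard)
```

With γ = 0 the modulating factor disappears and the expression reduces to α-weighted cross-entropy, which is why Focal Loss is often described as cross-entropy that concentrates on hard, misclassified samples.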
Taxonomy
Topics: Video Coding and Compression Technologies · Advanced Data Compression Techniques · Human Pose and Action Recognition
