TL;DR
This paper presents an explainable neural network approach for improving fractional interpolation in video coding, achieving better compression efficiency at lower complexity than traditional CNN methods.
Contribution
A novel interpretable neural network framework for fractional interpolation in video coding, enabling efficient training and integration with existing standards.
Findings
Achieves BD-rate savings of up to 2.25% in VVC.
Reduces computational complexity of interpolation.
Provides an explainable, linear-structured neural network model.
Abstract
The versatility of recent machine learning approaches makes them well suited to improving next-generation video compression solutions. Unfortunately, these approaches typically bring significant increases in computational complexity and are difficult to interpret as explainable models, which limits their potential for implementation within practical video coding applications. This paper introduces a novel explainable neural network-based inter-prediction scheme to improve the interpolation of reference samples needed for fractional precision motion compensation. The approach requires training only a single neural network, from which a full quarter-pixel interpolation filter set is derived; the network is easily interpretable owing to its linear structure. A novel training framework enables each network branch to resemble a specific fractional shift. This practical solution makes it very…
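The abstract's point that a linear-structured network can be read off as an interpolation filter can be illustrated with a minimal sketch. It assumes a 1-D branch of three convolution layers with no bias and no activation, random weights, and hypothetical kernel sizes; none of these are taken from the paper, and the function names `linear_branch` and `collapse_branch` are made up for illustration. Under those assumptions, the stack is equivalent to a single filter obtained by convolving the layer kernels together.

```python
import numpy as np

def linear_branch(x, kernels):
    """Apply a stack of 1-D convolution layers with no bias and no activation."""
    for k in kernels:
        x = np.convolve(x, k, mode="full")
    return x

def collapse_branch(kernels):
    """Fold the stack into one equivalent filter by convolving the kernels together."""
    taps = np.array([1.0])
    for k in kernels:
        taps = np.convolve(taps, k)
    return taps

rng = np.random.default_rng(0)

# Hypothetical branch: three 3-tap layers (an assumption, not the paper's design).
# np.convolve flips the kernel (true convolution); CNN layers usually compute
# cross-correlation, but the same collapsing argument holds for linear layers.
branch = [rng.standard_normal(3) for _ in range(3)]
signal = rng.standard_normal(32)

layered = linear_branch(signal, branch)             # run layer by layer
taps = collapse_branch(branch)                      # one 7-tap equivalent filter
single = np.convolve(signal, taps, mode="full")     # run as a single filter

max_err = float(np.max(np.abs(layered - single)))
print("equivalent filter length:", taps.size)
print("max abs difference:", max_err)
```

In this toy setting the layer-by-layer output and the single-filter output agree up to floating-point rounding, which is why a trained linear branch could in principle be replaced at inference time by one fixed filter per fractional position.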
