ModeNet: Mode Selection Network For Learned Video Coding
Théo Ladune (IETR), Pierrick Philippe, Wassim Hamidouche (IETR), Lu Zhang (IETR), Olivier Déforges (IETR)

TL;DR
ModeNet introduces a pixel-wise mode selection network that improves deep learning-based video compression by enabling competition among coding modes, leading to enhanced performance in P-frame coding tasks.
Contribution
It presents a novel, trainable mode selection network that dynamically assigns coding modes at pixel level, improving learned video compression systems.
Findings
Achieves strong performance on CLIC20 P-frame coding track
Enables competition among coding modes for better compression
Flexible component adaptable to other coding systems
Abstract
In this paper, a mode selection network (ModeNet) is proposed to enhance deep learning-based video compression. Inspired by traditional video coding, ModeNet's purpose is to enable competition among several coding modes. ModeNet learns and conveys a pixel-wise partitioning of the frame, which is used to assign each pixel to the most suitable coding mode. ModeNet is trained jointly with the coding modes to minimize a rate-distortion cost. It is a flexible component that can be generalized to other systems to allow competition between different coding tools. The benefit of ModeNet is studied on a P-frame coding task, where it is used to design a method for coding a frame given its prediction. ModeNet-based systems achieve compelling performance when evaluated under the Challenge on Learned Image Compression 2020 (CLIC20) P-frame coding track conditions.
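The pixel-wise mode assignment and the rate-distortion cost described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes only two modes (copying the prediction, or using a transmitted reconstruction), a hand-written mask in place of a learned one, and mean squared error as the distortion. The names `blend_modes`, `mse`, `rd_cost`, the toy frames, and the rate and `lam` values are all hypothetical.

```python
# Illustrative sketch only: two assumed coding modes combined by a pixel-wise mask.
# All names and numbers here are hypothetical, not taken from the paper.

def blend_modes(alpha, pred, coded):
    """Per pixel: alpha = 1 keeps the prediction, alpha = 0 keeps the coded value."""
    return [
        [a * p + (1.0 - a) * c for a, p, c in zip(row_a, row_p, row_c)]
        for row_a, row_p, row_c in zip(alpha, pred, coded)
    ]

def mse(x, y):
    """Mean squared error between two frames given as lists of rows."""
    n = sum(len(row) for row in x)
    return sum((a - b) ** 2 for rx, ry in zip(x, y) for a, b in zip(rx, ry)) / n

def rd_cost(distortion, rate, lam):
    """Rate-distortion cost J = D + lam * R."""
    return distortion + lam * rate

# Toy 2x2 frame: the prediction is accurate in the left column only.
frame = [[10.0, 20.0], [30.0, 40.0]]
pred = [[10.0, 0.0], [30.0, 0.0]]
coded = [[12.0, 20.0], [28.0, 40.0]]   # stand-in for a transmitted reconstruction
alpha = [[1.0, 0.0], [1.0, 0.0]]       # mask: copy left column, code right column

recon = blend_modes(alpha, pred, coded)
cost_mixed = rd_cost(mse(recon, frame), rate=2.0, lam=0.5)      # made-up rate
cost_copy_only = rd_cost(mse(pred, frame), rate=0.0, lam=0.5)   # copy everywhere
print(cost_mixed, cost_copy_only)
```

In this toy case the mixed assignment has a lower cost than copying the prediction everywhere, which is the kind of competition between modes that a trained mask would be expected to exploit.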
Taxonomy
Topics: Video Coding and Compression Technologies · Advanced Vision and Imaging · Advanced Data Compression Techniques
