ModeNet: Mode Selection Network For Learned Video Coding

Théo Ladune (IETR); Pierrick Philippe; Wassim Hamidouche (IETR); Lu Zhang (IETR); Olivier Déforges (IETR)

arXiv:2007.02532·cs.NE·August 3, 2020

Open Access

TL;DR

ModeNet introduces a pixel-wise mode selection network that improves deep learning-based video compression by enabling competition among coding modes, leading to enhanced performance in P-frame coding tasks.

Contribution

It presents a novel, trainable mode selection network that dynamically assigns coding modes at pixel level, improving learned video compression systems.

Findings

01

Achieves compelling performance under the CLIC20 P-frame coding track conditions

02

Enables competition among coding modes for better compression

03

Flexible component adaptable to other coding systems

Abstract

In this paper, a mode selection network (ModeNet) is proposed to enhance deep learning-based video compression. Inspired by traditional video coding, ModeNet's purpose is to enable competition among several coding modes. The proposed ModeNet learns and conveys a pixel-wise partitioning of the frame, which is used to assign each pixel to the most suitable coding mode. ModeNet is trained alongside the different coding modes to minimize a rate-distortion cost. It is a flexible component that can be generalized to other systems to allow competition between different coding tools. The benefit of ModeNet is studied on a P-frame coding task, where it is used to design a method for coding a frame given its prediction. ModeNet-based systems achieve compelling performance when evaluated under the Challenge on Learned Image Compression 2020 (CLIC20) P-frame coding track conditions.
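
The idea of pixel-wise mode competition under a rate-distortion cost can be illustrated with a small toy sketch. This is not the authors' implementation: it assumes only two modes (copying the prediction versus transmitting the pixel), uses hand-written masks instead of a learned network, and invents a flat bit cost per transmitted pixel. All names (`blend_modes`, `rd_cost`, `bits_per_coded_pixel`, `lam`) are hypothetical.

```python
import numpy as np

def blend_modes(alpha, skip_out, coded_out):
    """Per-pixel selection: alpha=1 keeps the skip output, alpha=0 keeps the coded output."""
    return alpha * skip_out + (1.0 - alpha) * coded_out

def rd_cost(target, alpha, skip_out, coded_out, bits_per_coded_pixel, lam):
    """Toy rate-distortion cost J = D + lam * R.

    D is the mean squared error of the blended reconstruction.
    R is an assumed rate in bits per pixel: only transmitted pixels cost bits.
    """
    recon = blend_modes(alpha, skip_out, coded_out)
    distortion = np.mean((target - recon) ** 2)
    rate = np.sum(1.0 - alpha) * bits_per_coded_pixel / alpha.size
    return distortion + lam * rate

# Toy 4x4 frame: the prediction is exact on the left half and wrong on the right half.
target = np.arange(16, dtype=float).reshape(4, 4)
prediction = target.copy()
prediction[:, 2:] += 4.0
coded = target.copy()  # assumption: the transmitted mode reconstructs pixels perfectly

# Three candidate partitionings (hand-written here; a network would predict the mask).
mixed = np.ones((4, 4))
mixed[:, 2:] = 0.0  # skip the left half, transmit the right half
masks = {
    "all_skip": np.ones((4, 4)),
    "all_coded": np.zeros((4, 4)),
    "mixed": mixed,
}

costs = {
    name: rd_cost(target, alpha, prediction, coded, bits_per_coded_pixel=2.0, lam=1.0)
    for name, alpha in masks.items()
}
best = min(costs, key=costs.get)
print(costs, best)
```

With these toy numbers the mixed mask has the lowest cost, because it spends bits only where the prediction is poor. In a trained system the mask would be produced by a network and optimized jointly with the coding modes, as the abstract describes.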

Taxonomy

Topics: Video Coding and Compression Technologies · Advanced Vision and Imaging · Advanced Data Compression Techniques