Optical Flow and Mode Selection for Learning-based Video Coding
Théo Ladune (IETR), Pierrick Philippe, Wassim Hamidouche (IETR), Lu Zhang (IETR), Olivier Déforges (IETR)

TL;DR
This paper presents a learned inter-frame coding method built on two autoencoders: one conveys the optical flow and a pixel-wise coding mode, the other transmits the pixels that are not copied from the prediction. Under the CLIC20 P-frame coding conditions, it performs on par with HEVC.
Contribution
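It introduces two complementary autoencoders. MOFNet computes and conveys the optical flow and a pixel-wise coding mode selection; CodecNet transmits the content that is not copied directly from the prediction. Because the prediction can be copied, the optical flow can be learned end-to-end, without pre-training or a dedicated loss term.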
Findings
Performs on par with HEVC under the CLIC20 P-frame coding conditions.
Enables end-to-end learning of optical flow without pre-training.
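Uses two autoencoders: MOFNet for the optical flow and pixel-wise mode selection, and CodecNet for transmitting content that is not copied from the prediction.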
Abstract
This paper introduces a new method for inter-frame coding based on two complementary autoencoders: MOFNet and CodecNet. MOFNet aims at computing and conveying the Optical Flow and a pixel-wise coding Mode selection. The optical flow is used to perform a prediction of the frame to code. The coding mode selection enables competition between direct copy of the prediction and transmission through CodecNet. The proposed coding scheme is assessed under the Challenge on Learned Image Compression 2020 (CLIC20) P-frame coding conditions, where it is shown to perform on par with the state-of-the-art video codec ITU/MPEG HEVC. Moreover, the possibility of copying the prediction makes it possible to learn the optical flow in an end-to-end fashion, i.e., without relying on pre-training and/or a dedicated loss term.
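The decoding path described in the abstract (predict with the optical flow, then choose per pixel between copying the prediction and using the transmitted value) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and not the authors' implementation: the function names `warp_nearest` and `reconstruct`, nearest-neighbor warping, a hard binary `copy_mask`, and an identity function standing in for CodecNet. The abstract does not specify how the networks or the mode signal are actually built.

```python
import numpy as np

def warp_nearest(reference, flow):
    """Backward-warp a (H, W) reference frame with a per-pixel flow.

    flow has shape (H, W, 2) and holds (dy, dx) displacements.
    Nearest-neighbor sampling is an assumption chosen for simplicity.
    """
    h, w = reference.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.rint(ys + flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs + flow[..., 1]).astype(int), 0, w - 1)
    return reference[src_y, src_x]

def reconstruct(frame, reference, flow, copy_mask, codec=lambda x: x):
    """Pixel-wise mode selection between copy and transmission.

    copy_mask is True where the prediction is copied as-is.
    codec is a hypothetical stand-in for CodecNet (identity by default).
    """
    prediction = warp_nearest(reference, flow)
    transmitted = codec(frame)
    return np.where(copy_mask, prediction, transmitted)

# Toy example: the scene shifts one pixel to the left, and a new
# column of content (value 99) enters on the right edge.
reference = np.arange(16, dtype=float).reshape(4, 4)
frame = np.roll(reference, -1, axis=1)
frame[:, -1] = 99.0

flow = np.zeros((4, 4, 2))
flow[..., 1] = 1.0                      # each pixel reads from its right neighbor

copy_mask = np.ones((4, 4), dtype=bool)
copy_mask[:, -1] = False                # new content cannot be predicted: transmit it

reconstruction = reconstruct(frame, reference, flow, copy_mask)
```

In this toy case the first three columns are recovered by copying the motion-compensated prediction, and only the last column goes through the stand-in codec, which is the kind of competition between modes that the abstract describes.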
Taxonomy
Topics: Advanced Vision and Imaging · Video Coding and Compression Technologies · Advanced Image Processing Techniques
