Transform Network Architectures for Deep Learning based End-to-End Image/Video Coding in Subsampled Color Spaces

Hilmi E. Egilmez; Ankitesh K. Singh; Muhammed Coban; Marta Karczewicz; Yinhao Zhu; Yang Yang; Amir Said; Taco S. Cohen

arXiv:2103.01760·eess.IV·August 30, 2021

TL;DR

This paper explores deep learning architectures for end-to-end image/video coding in subsampled YUV 4:2:0 format, proposing a new transform network that outperforms existing methods and standard codecs like HEVC.

Contribution

It introduces a novel transform network architecture tailored for YUV 4:2:0 format, enhancing coding efficiency in deep learning based end-to-end image/video compression.

Findings

01

Proposed architecture achieves about 10% BD-rate reduction over HEVC intra coding.

02

Supports YUV 4:2:0 format, aligning with traditional video compression standards.

03

Outperforms naive extensions of RGB-based architectures in experiments.

Abstract

Most of the existing deep learning based end-to-end image/video coding (DLEC) architectures are designed for non-subsampled RGB color format. However, in order to achieve a superior coding performance, many state-of-the-art block-based compression standards such as High Efficiency Video Coding (HEVC/H.265) and Versatile Video Coding (VVC/H.266) are designed primarily for YUV 4:2:0 format, where U and V components are subsampled by considering the human visual system. This paper investigates various DLEC designs to support YUV 4:2:0 format by comparing their performance against the main profiles of HEVC and VVC standards under a common evaluation framework. Moreover, a new transform network architecture is proposed to improve the efficiency of coding YUV 4:2:0 data. The experimental results on YUV 4:2:0 datasets show that the proposed architecture significantly outperforms naive…
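To make the 4:2:0 layout mentioned in the abstract concrete, the following is a minimal sketch of how the three planes relate in size. It is not the paper's code: the function names are hypothetical, the 2x2 block averaging is an assumed downsampling filter (real converters may use other filters and chroma sample positions), and frame dimensions are assumed to be even.

```python
def plane_shapes_420(height, width):
    """Return (luma_shape, chroma_shape) for one YUV 4:2:0 frame.

    Assumption: height and width are even, so each chroma plane is
    exactly half the luma size in both directions.
    """
    if height % 2 or width % 2:
        raise ValueError("this sketch assumes even height and width")
    return (height, width), (height // 2, width // 2)


def subsample_2x2(plane):
    """Halve a full-resolution chroma plane by averaging each 2x2 block.

    Assumption: plain block averaging, chosen only for illustration.
    `plane` is a list of equal-length rows with even dimensions.
    """
    rows, cols = len(plane), len(plane[0])
    return [
        [
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def sample_count_420(height, width):
    """Total samples per frame: one Y plane plus quarter-size U and V planes."""
    luma, chroma = plane_shapes_420(height, width)
    return luma[0] * luma[1] + 2 * chroma[0] * chroma[1]
```

For a frame of height H and width W, a 4:2:0 frame carries 1.5·H·W samples, half of the 3·H·W samples in a non-subsampled RGB frame. The Y plane and the U/V planes also have different spatial resolutions, which is why a network built for three equal-size RGB channels cannot take 4:2:0 input unchanged.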
