CAESR: Conditional Autoencoder and Super-Resolution for Learned Spatial Scalability

Charles Bonnineau; Wassim Hamidouche; Jean-François Travers; Naty Sidaty; Jean-Yves Aubié; Olivier Deforges

arXiv:2202.00416·eess.IV·February 2, 2022

TL;DR

CAESR introduces a hybrid learning-based spatial scalability method combining VVC intra-mode encoding with a deep autoencoder and super-resolution, achieving competitive high-resolution reconstruction with scalable bitstreams.

Contribution

The paper proposes a novel hybrid coding framework that integrates VVC intra-mode with deep autoencoders and super-resolution for scalable video coding.

Findings

01

Competitive performance with VVC full-resolution intra coding

02

Effective spatial scalability through conditional autoencoding

03

Super-resolution module successfully reconstructs high-resolution details

Abstract

In this paper, we present CAESR, a hybrid learning-based coding approach for spatial scalability based on the Versatile Video Coding (VVC) standard. Our framework uses a low-resolution signal encoded with VVC intra-mode as the base layer (BL), and a deep conditional autoencoder with hyperprior (AE-HP) as the enhancement-layer (EL) model. The EL encoder takes both the upscaled BL reconstruction and the original image as inputs. Our approach relies on conditional coding, which learns the optimal mixture of the source and the upscaled BL image and thereby performs better than residual coding. On the decoder side, a super-resolution (SR) module recovers high-resolution details and inverts the conditional coding process. Experimental results show that our solution is competitive with VVC full-resolution intra coding while being scalable.
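
The difference between residual and conditional coding described in the abstract can be illustrated with a small NumPy sketch of the encoder-side inputs. This is not the authors' implementation: the 2x scale factor, the block-average downscaler, the nearest-neighbour upscaler, and the uniform quantizer standing in for VVC intra coding are all assumptions chosen for illustration, and every function name is hypothetical. The learned AE-HP and SR networks are omitted; only the shape of what the EL encoder would receive is shown.

```python
import numpy as np

def downscale_2x(img):
    """Assumed 2x downscale: average non-overlapping 2x2 blocks."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upscale_2x(img):
    """Assumed 2x nearest-neighbour upscale."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def toy_base_layer(low_res, step=16.0):
    """Stand-in for VVC intra coding (assumption): uniform quantization."""
    return np.round(low_res / step) * step

def residual_el_input(source, bl_up):
    """Residual coding: the EL codes the fixed difference source - BL."""
    return source - bl_up

def conditional_el_input(source, bl_up):
    """Conditional coding: the EL encoder sees both signals as channels
    and is free to learn how to mix them."""
    return np.stack([source, bl_up], axis=0)

# Toy single-channel "image" with even height and width.
rng = np.random.default_rng(0)
source = rng.uniform(0.0, 255.0, size=(8, 8))

bl_rec = toy_base_layer(downscale_2x(source))  # low-resolution BL reconstruction
bl_up = upscale_2x(bl_rec)                     # upscaled back to source resolution

res_in = residual_el_input(source, bl_up)      # shape (8, 8)
cond_in = conditional_el_input(source, bl_up)  # shape (2, 8, 8)
```

In the residual case the combination of source and BL is fixed to a subtraction before the EL model sees anything, whereas the two-channel conditional input leaves that combination to be learned, which is the property the abstract credits for the gain over residual coding.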
