ViSTRA2: Video Coding using Spatial Resolution and Effective Bit Depth Adaptation

Fan Zhang; Mariana Afonso; David R. Bull

arXiv:1911.02833·eess.IV·June 22, 2021

TL;DR

ViSTRA2 is a video compression framework that adaptively down-samples spatial resolution and effective bit depth at the encoder and restores them at the decoder with a deep convolutional neural network, yielding average BD-rate savings over both the HEVC and VVC reference codecs.

Contribution

The paper presents a framework that combines adaptive spatial resolution and effective bit depth with CNN-based up-sampling, and integrates it with the HEVC (HM 16.20) and VVC (VTM 4.01) reference software to improve their compression efficiency.

Findings

01

Achieves an average BD-rate saving of 12.6% (PSNR) over the HEVC reference software (HM 16.20)

02

Achieves an average BD-rate saving of 19.5% (VMAF) over the HEVC reference software (HM 16.20)

03

Shows consistent gains over both HEVC and VVC, with average BD-rate savings of 5.5% (PSNR) and 8.6% (VMAF) over the VVC reference software (VTM 4.01)
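
The BD-rate figures above summarise the average bitrate difference between two rate-quality curves at equal quality. As a rough illustration only, the Python sketch below computes a simplified BD-rate. It assumes piecewise-linear interpolation of log-bitrate against quality, not the polynomial fitting used in the standard Bjøntegaard calculation, so its output will not match the official tools exactly. The function name `bd_rate` and the sample rate points are hypothetical and are not taken from the paper.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation at x; xs must be strictly increasing."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    raise ValueError("x lies outside the interpolation range")


def _mean_log_rate(quality, log_rate, lo, hi):
    """Mean of the interpolated log-rate curve over the quality interval [lo, hi]."""
    # Including every interior breakpoint makes the trapezoid rule exact here.
    knots = [lo] + [q for q in quality if lo < q < hi] + [hi]
    area = 0.0
    for a, b in zip(knots, knots[1:]):
        fa = _interp(a, quality, log_rate)
        fb = _interp(b, quality, log_rate)
        area += 0.5 * (fa + fb) * (b - a)
    return area / (hi - lo)


def bd_rate(anchor, test):
    """Simplified BD-rate in percent; negative means the test codec saves bitrate.

    anchor, test: lists of (bitrate, quality) pairs with distinct quality values.
    """
    def prepare(points):
        pts = sorted(points, key=lambda p: p[1])
        return [q for _, q in pts], [math.log10(r) for r, _ in pts]

    q_anchor, l_anchor = prepare(anchor)
    q_test, l_test = prepare(test)

    # Compare the curves only where their quality ranges overlap.
    lo = max(q_anchor[0], q_test[0])
    hi = min(q_anchor[-1], q_test[-1])
    if hi <= lo:
        raise ValueError("quality ranges do not overlap")

    diff = _mean_log_rate(q_test, l_test, lo, hi) - _mean_log_rate(q_anchor, l_anchor, lo, hi)
    return (10 ** diff - 1) * 100


# Hypothetical rate points (kbps, PSNR in dB): the test codec needs 10% fewer bits.
anchor_points = [(1000, 32.0), (2000, 35.0), (4000, 38.0), (8000, 41.0)]
test_points = [(900, 32.0), (1800, 35.0), (3600, 38.0), (7200, 41.0)]
print(round(bd_rate(anchor_points, test_points), 2))
```

A negative result indicates a bitrate saving at equal quality, which is the sense in which the savings above are reported.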

Abstract

We present a new video compression framework (ViSTRA2) which exploits adaptation of spatial resolution and effective bit depth, down-sampling these parameters at the encoder based on perceptual criteria, and up-sampling at the decoder using a deep convolutional neural network. ViSTRA2 has been integrated with the reference software of both HEVC (HM 16.20) and VVC (VTM 4.01), and evaluated under the Joint Video Exploration Team Common Test Conditions using the Random Access configuration. Our results show consistent and significant compression gains against HM and VTM based on Bjøntegaard Delta measurements, with average BD-rate savings of 12.6% (PSNR) and 19.5% (VMAF) over HM and 5.5% (PSNR) and 8.6% (VMAF) over VTM.
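
The encoder-side down-sampling and decoder-side up-sampling described in the abstract can be pictured with a toy Python sketch. Everything in it is an assumption made for illustration: the fixed factor of 2, the 2x2 averaging filter, the 1-bit shift, and the function names are hypothetical, the perceptual adaptation decision and the host codec are omitted, and nearest-neighbour up-sampling with a left shift stands in for the deep convolutional neural network used in the paper.

```python
def downsample_2x(frame):
    """Halve width and height by averaging each 2x2 block (even dimensions assumed)."""
    return [
        [
            (frame[y][x] + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1] + 2) // 4
            for x in range(0, len(frame[0]), 2)
        ]
        for y in range(0, len(frame), 2)
    ]


def reduce_bit_depth(frame, bits=1):
    """Lower the effective bit depth by dropping the least significant bits."""
    return [[px >> bits for px in row] for row in frame]


def restore_bit_depth(frame, bits=1):
    """Placeholder restoration: shift back up (the dropped bits are not recovered)."""
    return [[px << bits for px in row] for row in frame]


def upsample_2x(frame):
    """Placeholder for the CNN: nearest-neighbour up-sampling by a factor of 2."""
    out = []
    for row in frame:
        wide = [px for px in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# Hypothetical 4x4 frame of 10-bit samples.
frame = [
    [512, 512, 1023, 1023],
    [512, 512, 1023, 1023],
    [0, 0, 256, 256],
    [0, 0, 256, 256],
]

# Encoder side: lower resolution and effective bit depth before the host codec.
low = reduce_bit_depth(downsample_2x(frame), bits=1)

# Decoder side: restore bit depth and resolution after decoding.
restored = upsample_2x(restore_bit_depth(low, bits=1))
print(low)
print(restored)
```

The placeholder restoration cannot recover detail lost in down-sampling, which is the role the learned up-sampling network plays in the actual framework.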
