ViSTRA2: Video Coding using Spatial Resolution and Effective Bit Depth Adaptation
Fan Zhang, Mariana Afonso, David R. Bull

TL;DR
ViSTRA2 is a video compression framework that adaptively reduces spatial resolution and effective bit depth before encoding and restores both after decoding with a deep convolutional neural network, giving average BD-rate savings over the HEVC and VVC reference codecs.
Contribution
The paper presents a framework that combines perceptually driven adaptation of spatial resolution and effective bit depth at the encoder with CNN-based up-sampling at the decoder, and integrates it with the HEVC (HM 16.20) and VVC (VTM 4.01) reference software.
Findings
Achieves an average BD-rate saving of 12.6% (PSNR) over HEVC (HM 16.20)
Achieves an average BD-rate saving of 19.5% (VMAF) over HEVC (HM 16.20)
Achieves average BD-rate savings of 5.5% (PSNR) and 8.6% (VMAF) over VVC (VTM 4.01)
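To make the BD-rate figures concrete, here is a minimal sketch of how an average bitrate difference at equal quality can be computed from two rate-quality curves. The standard Bjøntegaard method fits a cubic polynomial to log-rate as a function of quality; this sketch swaps in piecewise-linear interpolation to stay dependency-free, so its numbers will not match a reference BD-rate tool exactly. The function names and the data points in the usage example are hypothetical and are not taken from the paper.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation of y at x; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the interpolation range")


def bd_rate_linear(anchor, test, samples=100):
    """Approximate BD-rate (%) of `test` relative to `anchor`.

    Each argument is a list of (bitrate, quality) points with distinct
    quality values. A negative result means `test` needs fewer bits than
    `anchor` for the same quality.
    """
    a = sorted(anchor, key=lambda p: p[1])
    t = sorted(test, key=lambda p: p[1])
    a_q = [q for _, q in a]
    t_q = [q for _, q in t]
    a_log = [math.log10(r) for r, _ in a]
    t_log = [math.log10(r) for r, _ in t]

    # Only the quality range covered by both curves is compared.
    lo = max(a_q[0], t_q[0])
    hi = min(a_q[-1], t_q[-1])
    if hi <= lo:
        raise ValueError("the two curves do not overlap in quality")

    # Mean log-rate difference over [lo, hi], sampled at interval midpoints.
    total = 0.0
    for k in range(samples):
        q = lo + (k + 0.5) * (hi - lo) / samples
        total += _interp(q, t_q, t_log) - _interp(q, a_q, a_log)
    avg_diff = total / samples

    return (10 ** avg_diff - 1) * 100


# Hypothetical (bitrate in kbps, PSNR in dB) points, for illustration only.
anchor_points = [(1000, 30.0), (2000, 33.0), (4000, 36.0), (8000, 39.0)]
test_points = [(500, 30.0), (1000, 33.0), (2000, 36.0), (4000, 39.0)]
saving = bd_rate_linear(anchor_points, test_points)
```

In this made-up example the test codec needs half the bitrate at every quality level, so the result is a BD-rate of about -50%. The same calculation applies with VMAF in place of PSNR as the quality axis.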
Abstract
We present a new video compression framework (ViSTRA2) which exploits adaptation of spatial resolution and effective bit depth, down-sampling these parameters at the encoder based on perceptual criteria, and up-sampling at the decoder using a deep convolutional neural network. ViSTRA2 has been integrated with the reference software of both HEVC (HM 16.20) and VVC (VTM 4.01), and evaluated under the Joint Video Exploration Team Common Test Conditions using the Random Access configuration. Our results show consistent and significant compression gains against HM and VTM based on Bjøntegaard Delta measurements, with average BD-rate savings of 12.6% (PSNR) and 19.5% (VMAF) over HM and 5.5% (PSNR) and 8.6% (VMAF) over VTM.
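The down-sample, encode, decode, up-sample pipeline described in the abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the factor-of-2 spatial reduction, the 2x2 averaging filter, the one-bit reduction in effective bit depth, and the function names `downsample_frame` and `upsample_frame`. In particular, the nearest-neighbour and bit-shift restoration below is only a placeholder for the deep convolutional neural network that ViSTRA2 uses at the decoder, and the codec stage between the two functions is omitted.

```python
def downsample_frame(frame):
    """Encoder side (illustrative): halve width and height by 2x2 averaging,
    then drop the least significant bit to reduce effective bit depth by one.

    `frame` is a list of rows of integer luma samples with even dimensions.
    """
    height, width = len(frame), len(frame[0])
    out = []
    for y in range(0, height - 1, 2):
        row = []
        for x in range(0, width - 1, 2):
            block_sum = (frame[y][x] + frame[y][x + 1]
                         + frame[y + 1][x] + frame[y + 1][x + 1])
            mean = (block_sum + 2) // 4   # rounded mean of the 2x2 block
            row.append(mean >> 1)         # e.g. 10-bit sample -> 9-bit sample
        out.append(row)
    return out


def upsample_frame(frame):
    """Decoder side (placeholder): nearest-neighbour up-sampling and a left
    shift back to the original bit depth. ViSTRA2 uses a CNN for this step;
    this stand-in only shows where that network sits in the pipeline.
    """
    out = []
    for row in frame:
        wide = []
        for value in row:
            restored = value << 1         # restore the original sample range
            wide.extend([restored, restored])
        out.append(wide)
        out.append(list(wide))            # duplicate the row vertically
    return out


# A tiny 2x4 "frame" of 10-bit samples, made up for illustration.
original = [[512, 512, 100, 104],
            [512, 512, 100, 104]]
low_res = downsample_frame(original)      # this is what the codec would encode
restored = upsample_frame(low_res)        # this is what the viewer would see
```

The low-resolution, lower-bit-depth frame carries a quarter of the samples, which is where the bitrate saving comes from; the quality of the final video then depends on how well the up-sampling step recovers the lost detail, which is the role of the CNN in the paper.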
Taxonomy
MethodsTest · Convolution
