TL;DR
This paper introduces a recurrent auto-encoder and probability model for video compression, leveraging extensive temporal information to improve compression efficiency and outperform existing methods in PSNR and MS-SSIM.
Contribution
It proposes a novel recurrent learned video compression framework that utilizes recurrent cells and probability modeling to better exploit temporal correlations.
Findings
Achieves state-of-the-art compression performance in PSNR and MS-SSIM.
Outperforms x265's Low-Delay P setting in PSNR.
Provides code implementation for reproducibility.
Abstract
The past few years have witnessed increasing interest in applying deep learning to video compression. However, the existing approaches compress a video frame with only a small number of reference frames, which limits their ability to fully exploit the temporal correlation among video frames. To overcome this shortcoming, this paper proposes a Recurrent Learned Video Compression (RLVC) approach with the Recurrent Auto-Encoder (RAE) and Recurrent Probability Model (RPM). Specifically, the RAE employs recurrent cells in both the encoder and decoder. As such, the temporal information in a large range of frames can be used for generating latent representations and reconstructing compressed outputs. Furthermore, the proposed RPM network recurrently estimates the Probability Mass Function (PMF) of the latent representation, conditioned on the distribution of previous latent representations. Due…
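The abstract's point about conditioning the PMF on previous latent representations can be illustrated with a toy calculation. The sketch below is not the paper's RPM network: it assumes a simple first-order model built from empirical counts as a stand-in for a learned recurrent estimator, and the function names and the `latents` sequence are hypothetical. It compares the ideal code length of a temporally correlated symbol sequence under one shared PMF against a PMF conditioned on the previous symbol, ignoring the cost of transmitting the PMFs themselves.

```python
import math
from collections import Counter, defaultdict


def ideal_bits_independent(symbols):
    """Ideal code length in bits when every time step shares one PMF."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum(-math.log2(counts[s] / total) for s in symbols)


def ideal_bits_conditional(symbols):
    """Ideal code length in bits when the PMF is conditioned on the previous symbol.

    Assumption: a first-order empirical model stands in for a learned
    recurrent probability model; this is an illustration only.
    """
    # Count how often each symbol follows each context symbol.
    transitions = defaultdict(Counter)
    for prev, cur in zip(symbols, symbols[1:]):
        transitions[prev][cur] += 1

    # The first symbol has no context, so it is coded with the marginal PMF.
    counts = Counter(symbols)
    bits = -math.log2(counts[symbols[0]] / len(symbols))

    # Every later symbol is coded with the PMF of its context.
    for prev, cur in zip(symbols, symbols[1:]):
        context = transitions[prev]
        bits += -math.log2(context[cur] / sum(context.values()))
    return bits


# Hypothetical quantized latent values that change slowly over time.
latents = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]

independent_bits = ideal_bits_independent(latents)
conditional_bits = ideal_bits_conditional(latents)
print(f"independent PMF: {independent_bits:.2f} bits")
print(f"conditional PMF: {conditional_bits:.2f} bits")
```

On this sequence the conditional model needs fewer bits because each symbol is usually the same as its predecessor, which is the kind of temporal correlation a conditional probability model is meant to exploit.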
Taxonomy
Methods: Regularized Autoencoders
