Perceptual Learned Video Compression with Recurrent Conditional GAN
Ren Yang, Radu Timofte, Luc Van Gool

TL;DR
This paper introduces a perceptual learned video compression method using a recurrent auto-encoder and a novel recurrent conditional GAN discriminator to achieve high perceptual quality and temporal consistency at low bit-rates.
Contribution
It presents a new perceptual video compression framework with a recurrent conditional discriminator that enhances spatial and temporal realism in compressed videos.
Findings
Outperforms the official HEVC test model (HM 16.20) and existing learned video compression approaches on perceptual quality metrics.
Achieves good perceptual quality at low bit-rates.
Demonstrates temporal consistency and realism in compressed videos.
Abstract
This paper proposes a Perceptual Learned Video Compression (PLVC) approach with a recurrent conditional GAN. We employ the recurrent auto-encoder-based compression network as the generator and, most importantly, we propose a recurrent conditional discriminator, which distinguishes raw from compressed video conditioned on both spatial and temporal features, including the latent representation, temporal motion, and hidden states in recurrent cells. This way, the adversarial training pushes the generated video to be not only spatially photo-realistic but also temporally consistent with the ground truth and coherent among video frames. The experimental results show that the learned PLVC model compresses video with good perceptual quality at low bit-rate, and that it outperforms the official HEVC test model (HM 16.20) and the existing learned video compression approaches for several perceptual…
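The conditioning idea in the abstract can be illustrated with a toy sketch. It is not the paper's architecture: the class name, the single tanh recurrent cell, the random weight initialization, the flat list-of-floats features, and the standard GAN discriminator loss are all assumptions made for illustration. The only parts taken from the abstract are that the discriminator scores raw vs. compressed frames, that it is recurrent, and that it is conditioned on the latent representation, temporal motion, and the generator's hidden states.

```python
import math
import random


def concat(*vectors):
    """Concatenate several feature vectors (lists of floats) into one."""
    return [value for vec in vectors for value in vec]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def build_condition(latent, motion, generator_hidden):
    """Hypothetical helper: stack the three conditioning signals named in the abstract."""
    return concat(latent, motion, generator_hidden)


class RecurrentConditionalDiscriminator:
    """Toy stand-in (assumed design): one tanh recurrent cell plus a linear scoring head."""

    def __init__(self, frame_dim, cond_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        in_dim = frame_dim + cond_dim
        self.hidden_dim = hidden_dim
        # Small random weights; a real model would learn these by adversarial training.
        self.w_in = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                     for _ in range(hidden_dim)]
        self.w_rec = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
                      for _ in range(hidden_dim)]
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]

    def score_sequence(self, frames, conditions):
        """Return one raw-vs-compressed logit per frame, carrying state across frames."""
        hidden = [0.0] * self.hidden_dim
        logits = []
        for frame, cond in zip(frames, conditions):
            # Conditional: the frame is judged together with its conditioning features.
            x = concat(frame, cond)
            pre = [a + b for a, b in zip(matvec(self.w_in, x),
                                         matvec(self.w_rec, hidden))]
            # Recurrent: the new state depends on the state from earlier frames.
            hidden = [math.tanh(p) for p in pre]
            logits.append(sum(w * h for w, h in zip(self.w_out, hidden)))
        return logits


def softplus(x):
    """log(1 + exp(x)); adequate for the small logits in this sketch."""
    return math.log1p(math.exp(x))


def discriminator_loss(raw_logits, compressed_logits):
    """Standard GAN discriminator loss (an assumption, not taken from the paper)."""
    raw_term = sum(softplus(-l) for l in raw_logits) / len(raw_logits)
    comp_term = sum(softplus(l) for l in compressed_logits) / len(compressed_logits)
    return raw_term + comp_term
```

In this sketch the raw and the compressed sequence would each be passed to `score_sequence` with the same per-frame conditions, so the discriminator compares them under identical latent, motion, and hidden-state context. Because the hidden state is carried across frames, a change in an early frame also affects the scores of later frames.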
Taxonomy
Topics: Advanced Image Processing Techniques · Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis
