Spatiotemporal Entropy Model is All You Need for Learned Video Compression
Zhenhong Sun, Zhiyu Tan, Xiuyu Sun, Fangyi Zhang, Dongyang Li, Yichen Qian, Hao Li

TL;DR
This paper introduces a simplified learned video compression framework that directly compresses raw frames using a unified auto-encoder and a spatiotemporal entropy model, outperforming existing methods in MS-SSIM.
Contribution
It proposes a novel, simplified framework for learned video compression that eliminates motion prediction modules and uses a spatiotemporal entropy model for better efficiency.
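To make the role of an entropy model with temporal context more concrete, here is a minimal sketch of how a rate estimate might change when the prior for the current frame's latent is conditioned on the previous frame's latent. This summary does not describe the model's actual form, so everything below is an assumption for illustration: the discretized Gaussian likelihood (a common choice in learned image compression), the toy latents, and the hypothetical names `symbol_bits` and `estimate_bits`. Only temporal conditioning is shown; spatial context is omitted.

```python
import math


def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def symbol_bits(symbol, mean, scale, floor=1e-9):
    """Bits for one integer symbol under a Gaussian integrated over
    [symbol - 0.5, symbol + 0.5] (assumed likelihood, not from the paper)."""
    p = gaussian_cdf(symbol + 0.5, mean, scale) - gaussian_cdf(symbol - 0.5, mean, scale)
    return -math.log2(max(p, floor))  # floor avoids log2(0) for very unlikely symbols


def estimate_bits(latent, means, scales):
    """Total estimated bits for a flat list of quantized latent values."""
    return sum(symbol_bits(y, m, s) for y, m, s in zip(latent, means, scales))


# Toy quantized latents for two consecutive frames (made-up values).
prev_latent = [4, -3, 7, 0, 2]
curr_latent = [5, -3, 6, 0, 2]
n = len(curr_latent)

# Hypothetical temporal prior: the mean is the previous frame's latent.
temporal_bits = estimate_bits(curr_latent, prev_latent, [1.0] * n)

# Baseline prior without temporal context: zero mean, wider scale.
static_bits = estimate_bits(curr_latent, [0.0] * n, [4.0] * n)

print(f"temporal prior: {temporal_bits:.2f} bits")
print(f"static prior:   {static_bits:.2f} bits")
```

When consecutive latents are strongly correlated, as in this toy data, the temporally conditioned prior assigns higher probability to the observed symbols and therefore yields a lower estimated bit count. In a learned codec the means and scales would be predicted by a network rather than copied from the previous latent.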
Findings
Outperforms state-of-the-art methods on the MS-SSIM metric
Achieves competitive results on PSNR
Reduces framework complexity significantly
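For readers unfamiliar with the PSNR metric mentioned in the findings, the standard definition is 10·log10(MAX² / MSE), where MSE is the mean squared error between the original and reconstructed pixels. The sketch below is a plain-Python illustration on flat lists of pixel values; the function name `psnr` and the 8-bit default `max_value=255.0` are assumptions, and it is not the paper's evaluation code. MS-SSIM is a multi-scale structural measure and is not reproduced here.

```python
import math


def psnr(original, reconstructed, max_value=255.0):
    """PSNR in dB between two equally sized flat sequences of pixel values."""
    if not original or len(original) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel off by one gray level gives MSE = 1.
example_db = psnr([10, 20, 30], [11, 21, 31])
print(f"{example_db:.2f} dB")
```

Higher PSNR means lower pixel-wise error. Video results are typically reported as an average over frames, but the exact protocol used in the paper is not stated in this summary.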
Abstract
The framework of dominant learned video compression methods is usually composed of motion prediction modules as well as motion vector and residual image compression modules, suffering from its complex structure and error propagation problem. Approaches have been proposed to reduce the complexity by replacing motion prediction modules with implicit flow networks. Error propagation aware training strategy is also proposed to alleviate incremental reconstruction errors from previously decoded frames. Although these methods have brought some improvement, little attention has been paid to the framework itself. Inspired by the success of learned image compression through simplifying the framework with a single deep neural network, it is natural to expect a better performance in video compression via a simple yet appropriate framework. Therefore, we propose a framework to directly compress…
Taxonomy
Topics: Advanced Data Compression Techniques · Advanced Image Processing Techniques · Image and Signal Denoising Methods
Methods: Attentive Walk-Aggregating Graph Neural Network
