Spatiotemporal Entropy Model is All You Need for Learned Video Compression

Zhenhong Sun; Zhiyu Tan; Xiuyu Sun; Fangyi Zhang; Dongyang Li; Yichen Qian; Hao Li

arXiv:2104.06083 · eess.IV · April 14, 2021 · 6 cites

TL;DR

This paper introduces a simplified learned video compression framework that directly compresses raw frames using a unified auto-encoder and a spatiotemporal entropy model, outperforming existing methods in MS-SSIM.

Contribution

It proposes a novel, simplified framework for learned video compression that eliminates motion prediction modules and uses a spatiotemporal entropy model for better efficiency.
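
As a rough illustration of why conditioning the entropy model on an already-decoded frame can save bits, the sketch below estimates the code length of quantized latents under a discretized Gaussian prior. It is only a toy under stated assumptions: the function names (`gaussian_cdf`, `symbol_bits`, `frame_bits`), the toy latent values, and the hand-picked means and scales are all hypothetical, and in the paper the prior would come from a learned network rather than a fixed rule.

```python
import math

def gaussian_cdf(x, mean, scale):
    """Cumulative distribution function of a Gaussian N(mean, scale^2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))

def symbol_bits(symbol, mean, scale, floor=1e-9):
    """Ideal code length (bits) of an integer symbol under a Gaussian
    discretized to unit-width bins centered on the integers."""
    p = gaussian_cdf(symbol + 0.5, mean, scale) - gaussian_cdf(symbol - 0.5, mean, scale)
    return -math.log2(max(p, floor))  # floor avoids log(0) for very unlikely symbols

def frame_bits(latents, means, scales):
    """Total estimated bits for one frame's latents."""
    return sum(symbol_bits(y, m, s) for y, m, s in zip(latents, means, scales))

# Toy quantized latents for two consecutive frames (assumed values).
prev_latents = [3, -1, 0, 2]
curr_latents = [3, -1, 1, 2]

# Assumed temporal prior: center each symbol on the previous frame's latent
# with a narrow scale, standing in for a learned spatiotemporal prior.
temporal_bits = frame_bits(curr_latents, prev_latents, [0.5] * 4)

# Baseline: a zero-mean, wider prior that ignores the previous frame.
independent_bits = frame_bits(curr_latents, [0.0] * 4, [2.0] * 4)

print(f"temporal prior:    {temporal_bits:.2f} bits")
print(f"independent prior: {independent_bits:.2f} bits")
```

When consecutive frames are similar, the temporally conditioned prior assigns higher probability to the actual symbols, so the estimated bit count drops without any explicit motion prediction step.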

Findings

01 Outperforms state-of-the-art methods on the MS-SSIM metric

02 Achieves competitive results on PSNR

03 Substantially reduces framework complexity

Abstract

The framework of dominant learned video compression methods is usually composed of motion prediction modules as well as motion vector and residual image compression modules, and it suffers from a complex structure and from error propagation. Approaches have been proposed to reduce the complexity by replacing motion prediction modules with implicit flow networks. An error-propagation-aware training strategy has also been proposed to alleviate incremental reconstruction errors from previously decoded frames. Although these methods have brought some improvement, little attention has been paid to the framework itself. Inspired by the success of learned image compression, which simplified the framework to a single deep neural network, it is natural to expect better performance in video compression from a simple yet appropriate framework. Therefore, we propose a framework to directly compress…

Code & Models

Repositories

tinyvision/IPCodec · tf · Official

Taxonomy

Topics: Advanced Data Compression Techniques · Advanced Image Processing Techniques · Image and Signal Denoising Methods

Methods: Attentive Walk-Aggregating Graph Neural Network