STIP: A SpatioTemporal Information-Preserving and Perception-Augmented Model for High-Resolution Video Prediction

Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, and Wen Gao

arXiv:2206.04381·cs.CV·June 10, 2022·6 cites

Open Access · 1 Repo

TL;DR

This paper introduces STIP, a novel high-resolution video prediction model that preserves spatiotemporal information and enhances perceptual quality using a multi-grained auto-encoder, a spatiotemporal GRU, and a GAN-based perceptual loss.

Contribution

The paper proposes a new model combining MGST-AE, STGRU, and a perceptual loss to improve high-resolution video prediction performance.

Findings

01 Outperforms state-of-the-art methods in visual quality
02 Preserves more spatiotemporal information during feature extraction
03 Achieves better perceptual quality with lower computational load

Abstract

Although recurrent neural network (RNN) based video prediction methods have made significant progress, their performance on high-resolution datasets is still far from satisfactory because of the information loss problem and the perception-insensitive mean squared error (MSE) based loss functions. In this paper, we propose a Spatiotemporal Information-Preserving and Perception-Augmented Model (STIP) to solve these two problems. To address the information loss problem, the proposed model preserves the spatiotemporal information of videos during feature extraction and during state transitions. First, a Multi-Grained Spatiotemporal Auto-Encoder (MGST-AE) is designed based on the X-Net structure. The proposed MGST-AE helps the decoders recall multi-grained information from the encoders in both the temporal and spatial domains. In this way,…
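The abstract's remark that MSE-based losses are perception-insensitive can be illustrated with a toy example. The sketch below is not taken from the paper: the 1-D "frames" and the helper names `mse` and `bar` are hypothetical, chosen only to show that a blurry average of two plausible futures can score a lower MSE than a sharp prediction whose object is slightly misplaced.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def bar(start, width=4, length=32):
    """Toy 1-D 'frame' (an assumption for illustration): zeros with a bright bar."""
    return [1.0 if start <= i < start + width else 0.0 for i in range(length)]


target = bar(10)         # ground-truth next frame
sharp_shifted = bar(14)  # sharp prediction, object misplaced by 4 pixels

# Blurry prediction: pixel-wise average of two plausible futures.
blurry = [(a + b) / 2 for a, b in zip(bar(10), bar(14))]

mse_sharp = mse(sharp_shifted, target)
mse_blurry = mse(blurry, target)

print("sharp but shifted:", mse_sharp)
print("blurry average:   ", mse_blurry)
```

In this toy setting the blurry average gets the lower error even though it contains no sharp object at all, which is the kind of behavior a perceptual loss is meant to counteract.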

Code & Models

Repositories

zhengchang467/stiphr (PyTorch, official)

Taxonomy

Topics: Advanced Image Processing Techniques · Advanced Vision and Imaging · Image and Video Quality Assessment