A Dynamic Multi-Scale Voxel Flow Network for Video Prediction

Xiaotao Hu; Zhewei Huang; Ailin Huang; Jun Xu; Shuchang Zhou

arXiv:2303.09875 · cs.CV · March 27, 2023 · 6 cites

TL;DR

This paper introduces DMVFN, a neural network that efficiently predicts future video frames using only RGB images, by adaptively selecting sub-networks based on motion scales, achieving high quality with lower computational costs.

Contribution

The paper presents a novel differentiable routing module enabling adaptive sub-network selection for improved video prediction efficiency and quality using only RGB inputs.

Findings

01. DMVFN is an order of magnitude faster than Deep Voxel Flow.

02. DMVFN surpasses the state-of-the-art iterative-based OPT in generated image quality.

03. Using only RGB inputs, DMVFN matches or exceeds the prediction quality of previous methods at lower computational cost.

Abstract

The performance of video prediction has been greatly boosted by advanced deep neural networks. However, most current methods suffer from large model sizes and require extra inputs, e.g., semantic/depth maps, to perform well. For efficiency, we propose a Dynamic Multi-scale Voxel Flow Network (DMVFN) that achieves better video prediction performance than previous methods at lower computational cost, using only RGB images. The core of our DMVFN is a differentiable routing module that can effectively perceive the motion scales of video frames. Once trained, our DMVFN selects adaptive sub-networks for different inputs at the inference stage. Experiments on several benchmarks demonstrate that our DMVFN is an order of magnitude faster than Deep Voxel Flow and surpasses the state-of-the-art iterative-based OPT on generated image quality. Our code and…
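
The inference-time behaviour described in the abstract, where a routing step picks a sub-network per input, can be illustrated with a minimal, forward-only sketch. Everything in it is an assumption made for illustration: `routing_probabilities` is a hand-written heuristic standing in for the learned routing module, the "blocks" are toy functions rather than flow-estimation layers, and the names `binarize` and `run_selected` are hypothetical. It is not the code from the official repository, and it does not show how the routing is made differentiable during training.

```python
from typing import Callable, List, Sequence, Tuple

Block = Callable[[float], float]


def routing_probabilities(prev: Sequence[float], curr: Sequence[float],
                          num_blocks: int) -> List[float]:
    """Hypothetical stand-in for a learned routing module.

    Uses the mean absolute frame difference as a crude motion cue and
    returns one keep-probability per block. Assumption: later blocks
    are only worth running when the motion is larger.
    """
    motion = sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)
    return [min(1.0, max(0.0, 0.5 + 4.0 * (motion - i / num_blocks)))
            for i in range(num_blocks)]


def binarize(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn keep-probabilities into a 0/1 routing vector."""
    return [1 if p >= threshold else 0 for p in probs]


def run_selected(blocks: Sequence[Block], mask: Sequence[int],
                 state: float) -> Tuple[float, int]:
    """Run only the blocks whose routing entry is 1; skip the rest."""
    executed = 0
    for block, keep in zip(blocks, mask):
        if keep:
            state = block(state)
            executed += 1
    return state, executed


# Toy blocks: each one nudges a scalar "estimate" by a fixed amount.
blocks: List[Block] = [lambda s, k=k: s + 1.0 / (k + 1) for k in range(4)]

# Two flattened toy "frame pairs" with pixel values in [0, 1].
static_mask = binarize(routing_probabilities([0.2] * 4, [0.2] * 4, 4))
moving_mask = binarize(routing_probabilities([0.0] * 4, [0.6] * 4, 4))

_, static_cost = run_selected(blocks, static_mask, 0.0)
_, moving_cost = run_selected(blocks, moving_mask, 0.0)
print(static_mask, static_cost)
print(moving_mask, moving_cost)
```

In this toy setup the static pair triggers fewer blocks than the moving pair, which is the source of the compute savings: the cost of a forward pass depends on the input rather than being fixed by the architecture.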

Code & Models

Repositories

megvii-research/CVPR2023-DMVFN
PyTorch · Official

Taxonomy

Topics: Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition

Methods: OPT · A Dynamic Multi-Scale Voxel Flow Network