ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame Interpolation
Duolikun Danier, Fan Zhang, David Bull

TL;DR
ST-MFNet is a novel deep learning framework for video frame interpolation that effectively handles large motions, occlusions, and dynamic textures by combining multi-flow prediction, 3D CNN modeling, and perceptual training.
Contribution
The paper introduces ST-MFNet, a multi-flow spatio-temporal network that combines a multi-scale multi-flow predictor with a 3D CNN and is trained within an ST-GAN framework, advancing the state of the art in VFI performance.
Findings
Outperforms 14 state-of-the-art VFI algorithms.
Achieves up to 1.09 dB higher PSNR on challenging datasets.
Effectively handles large motions and dynamic textures.
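To put the PSNR figure above in context, here is a minimal sketch of the standard definition, PSNR = 10 log10(peak^2 / MSE). It is not the paper's evaluation code: the function names `psnr` and `mse_ratio_for_gain`, the flat-list pixel representation, and the 8-bit peak value of 255 are all assumptions made for illustration.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB (same peak value)."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    # Hypothetical pixel values, not data from the paper.
    print(psnr([100, 100], [110, 90]))
    print(mse_ratio_for_gain(1.09))
```

Under this definition, and with the peak value held fixed, a 1.09 dB gain corresponds to a mean squared error roughly 1.29 times smaller.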
Abstract
Video frame interpolation (VFI) is currently a very active research topic, with applications spanning computer vision, post production and video encoding. VFI can be extremely challenging, particularly in sequences containing large motions, occlusions or dynamic textures, where existing approaches fail to offer perceptually robust interpolation performance. In this context, we present a novel deep learning based VFI method, ST-MFNet, based on a Spatio-Temporal Multi-Flow architecture. ST-MFNet employs a new multi-scale multi-flow predictor to estimate many-to-one intermediate flows, which are combined with conventional one-to-one optical flows to capture both large and complex motions. In order to enhance interpolation performance for various textures, a 3D CNN is also employed to model the content dynamics over an extended temporal window. Moreover, ST-MFNet has been trained within an…
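The difference between one-to-one and many-to-one flows mentioned in the abstract can be illustrated with a simplified 1-D sketch. In one-to-one backward warping, each output sample reads a single displaced source location. In a many-to-one scheme, each output sample is a weighted sum of several displaced source locations. This is an illustration under stated assumptions, not ST-MFNet's implementation: the helper names (`sample_linear`, `warp_one_to_one`, `warp_many_to_one`), the 1-D signal, the border clamping, and the hand-picked flows and weights are all hypothetical. In the actual network, flows and weights would be predicted per pixel on 2-D frames.

```python
def sample_linear(signal, pos):
    """Linearly interpolate a 1-D signal at a fractional position (borders clamped)."""
    pos = min(max(pos, 0.0), len(signal) - 1.0)
    left = int(pos)
    right = min(left + 1, len(signal) - 1)
    frac = pos - left
    return (1.0 - frac) * signal[left] + frac * signal[right]


def warp_one_to_one(signal, flow):
    """Backward warp: each output sample reads exactly one displaced source sample."""
    return [sample_linear(signal, x + flow[x]) for x in range(len(signal))]


def warp_many_to_one(signal, flows, weights):
    """Each output sample is a weighted sum over several displaced source samples.

    flows and weights are lists of per-sample lists, one pair per flow.
    """
    out = []
    for x in range(len(signal)):
        value = sum(w[x] * sample_linear(signal, x + f[x])
                    for f, w in zip(flows, weights))
        out.append(value)
    return out


if __name__ == "__main__":
    src = [0.0, 10.0, 20.0, 30.0]  # hypothetical 1-D "frame"
    print(warp_one_to_one(src, [1.0] * 4))
    print(warp_many_to_one(src,
                           flows=[[1.0] * 4, [-1.0] * 4],
                           weights=[[0.5] * 4, [0.5] * 4]))
```

With several flows per output location, a single output sample can blend content from different source positions. That is one intuition for why such schemes may cope better with complex motion than a single displacement per pixel.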
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Video Coding and Compression Technologies
Methods: 3 Dimensional Convolutional Neural Network
