Deep Video Matting via Spatio-Temporal Alignment and Aggregation

Yanan Sun; Guanzhi Wang; Qiao Gu; Chi-Keung Tang; Yu-Wing Tai

arXiv:2104.11208·cs.CV·April 23, 2021

Open Access · 1 Repo

TL;DR

This paper introduces a deep learning framework for video matting that uses spatio-temporal feature aggregation and a new dataset, significantly improving performance over existing methods.

Contribution

It proposes a novel spatio-temporal feature aggregation module and a lightweight trimap propagation network, along with a large-scale dataset for training and evaluation.

Findings

01 Outperforms traditional video and image matting methods

02 Effectively aligns and aggregates information across frames

03 Demonstrates significant accuracy improvements

Abstract

Despite the significant progress made by deep learning in natural image matting, there has so far been no representative work on deep learning for video matting, owing to the inherent technical challenges of reasoning in the temporal domain and the lack of large-scale video matting datasets. In this paper, we propose a deep learning-based video matting framework which employs a novel and effective spatio-temporal feature aggregation module (ST-FAM). As optical flow estimation can be very unreliable within matting regions, ST-FAM is designed to effectively align and aggregate information across different spatial scales and temporal frames within the network decoder. To eliminate frame-by-frame trimap annotations, a lightweight interactive trimap propagation network is also introduced. The other contribution consists of a large-scale video matting dataset with groundtruth alpha mattes for quantitative…
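
For readers new to the terms in the abstract, the following is a minimal sketch of what an alpha matte and a trimap represent. It uses the standard compositing equation I = alpha * F + (1 - alpha) * B and is not taken from the paper or the nowsyn/DVM repository. The function names `composite` and `trimap_label`, the threshold `eps`, and the label values 0/128/255 are assumptions chosen for illustration.

```python
# Illustrative only: not code from the paper or the nowsyn/DVM repository.

def composite(alpha, fg, bg):
    """Blend one pixel: I = alpha * F + (1 - alpha) * B, per colour channel."""
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


def trimap_label(alpha, eps=1e-6):
    """Map a ground-truth alpha value to a trimap label.

    0 = definite background, 255 = definite foreground,
    128 = unknown region (these values are a common convention, assumed here).
    """
    if alpha <= eps:
        return 0
    if alpha >= 1.0 - eps:
        return 255
    return 128


if __name__ == "__main__":
    fg, bg = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)  # red foreground, blue background
    for a in (0.0, 0.25, 1.0):
        print(a, composite(a, fg, bg), trimap_label(a))
```

A matting network estimates alpha only inside the unknown region of the trimap. In practice the unknown band is usually widened around the object boundary, a step this per-pixel sketch leaves out.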

Code & Models

Repositories

nowsyn/DVM · PyTorch · Official

Taxonomy

Topics: Image Enhancement Techniques · Advanced Image Processing Techniques · Advanced Vision and Imaging