Generative Video Matting

Yongtao Ge; Kangyang Xie; Guangkai Xu; Mingyu Liu; Li Ke; Longtao Huang; Hui Xue; Hao Chen; Chunhua Shen

arXiv:2508.07905·cs.CV·August 12, 2025

TL;DR

This paper introduces a video matting method that combines large-scale pre-training on synthetic and pseudo-labeled segmentation data with priors from pre-trained diffusion models, aiming to improve temporal consistency and generalization to real-world scenes.

Contribution

The paper presents a scalable synthetic data generation pipeline that renders diverse human bodies and fine-grained hair, and a video matting architecture that draws on priors from pre-trained diffusion models.

Findings

1. Outperforms existing methods on three benchmark datasets.
2. Demonstrates strong generalization to diverse real-world scenes.
3. Achieves high temporal consistency in video matting.

Abstract

Video matting has traditionally been limited by the lack of high-quality ground-truth data. Most existing video matting datasets provide only imperfect, human-annotated alpha and foreground annotations, which must be composited onto background images or videos during the training stage. Thus, the generalization capability of previous methods in real-world scenarios is typically poor. In this work, we propose to solve the problem from two perspectives. First, we emphasize the importance of large-scale pre-training by pursuing diverse synthetic and pseudo-labeled segmentation datasets. We also develop a scalable synthetic data generation pipeline that can render diverse human bodies and fine-grained hair, yielding around 200 video clips with a 3-second duration for fine-tuning. Second, we introduce a novel video matting approach that can effectively leverage the rich priors from pre-trained…
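
The compositing step the abstract refers to is conventionally the standard matting equation, I = αF + (1 - α)B, applied per pixel and per frame. Below is a minimal NumPy sketch of that blend, assuming float arrays in [0, 1]. The function name `composite`, the array shapes, and the toy values are illustrative assumptions and are not taken from the paper or its code.

```python
import numpy as np


def composite(alpha: np.ndarray, foreground: np.ndarray, background: np.ndarray) -> np.ndarray:
    """Blend a foreground clip over a background with I = alpha*F + (1 - alpha)*B.

    alpha:      (T, H, W, 1) matte in [0, 1]
    foreground: (T, H, W, 3) RGB in [0, 1]
    background: (T, H, W, 3) clip or (H, W, 3) still image, RGB in [0, 1];
                a still image broadcasts across the T frames.
    """
    alpha = np.clip(alpha, 0.0, 1.0)  # guard against out-of-range matte values
    return alpha * foreground + (1.0 - alpha) * background


# Toy 2-frame, 4x4 clip (hypothetical values): the left half of each frame
# is fully foreground (alpha = 1), the right half is fully background (alpha = 0).
T, H, W = 2, 4, 4
alpha = np.zeros((T, H, W, 1), dtype=np.float32)
alpha[:, :, : W // 2] = 1.0
foreground = np.full((T, H, W, 3), 0.8, dtype=np.float32)
background = np.full((H, W, 3), 0.2, dtype=np.float32)

frames = composite(alpha, foreground, background)
```

Because the training input is synthesized this way, any error in the annotated alpha or foreground carries directly into the composited frames, which is consistent with the abstract's point that imperfect annotations limit generalization.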

Code & Models

Datasets

geyongtao/SynHairMan
dataset · 16 dl