Automatic Real-time Background Cut for Portrait Videos
Xiaoyong Shen, Ruixing Wang, Hengshuang Zhao, Jiaya Jia

TL;DR
This paper presents an end-to-end real-time portrait video segmentation framework that combines a global background attenuation model with a spatial-temporal refinement network, achieving high-quality background removal for 720p videos.
Contribution
It introduces a novel global background attenuation model and a spatial-temporal refinement network for efficient, high-quality real-time background segmentation in portrait videos.
Findings
Achieves high-quality background removal at 720p in real-time
Develops a new portrait dataset with 8,000 images
Fine-tunes on a 50-sequence portrait video dataset
Abstract
In this paper, we address the problem of high-quality, automatic, real-time background cut for 720p portrait videos. We first handle the background ambiguity issue in semantic segmentation by proposing a global background attenuation model. A spatial-temporal refinement network is then developed to correct the segmentation errors in each frame and to ensure temporal coherence in the segmentation map. We form an end-to-end network for training and testing, with each module designed for both efficiency and accuracy. We build a portrait dataset of 8,000 images with high-quality labeled maps for training and testing. To further improve performance, we build a portrait video dataset with 50 sequences to fine-tune video segmentation. Our framework benefits many video processing applications.
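To illustrate what a per-frame segmentation map is used for in a background cut, the sketch below composites one frame over a replacement background with standard alpha blending. It is not the paper's network or code. The function name `cut_background`, the float image layout, and the synthetic 720p inputs are all assumptions made for this example.

```python
import numpy as np

def cut_background(frame, mask, new_background):
    """Blend a portrait frame over a new background using a soft mask.

    frame, new_background: H x W x 3 float arrays with values in [0, 1].
    mask: H x W float array in [0, 1], where 1 marks foreground (the person).
    All names and conventions here are hypothetical, chosen for illustration.
    """
    # Clamp the mask and add a channel axis so it broadcasts over RGB.
    alpha = np.clip(mask, 0.0, 1.0)[..., None]
    # Standard alpha compositing: foreground where alpha is 1,
    # replacement background where alpha is 0.
    return alpha * frame + (1.0 - alpha) * new_background

# Synthetic 720p inputs (assumed data, not taken from the paper's datasets).
h, w = 720, 1280
frame = np.full((h, w, 3), 0.8)          # uniform light-gray "video frame"
background = np.zeros((h, w, 3))         # black replacement background
mask = np.zeros((h, w))
mask[:, w // 4 : 3 * w // 4] = 1.0       # central band treated as foreground

result = cut_background(frame, mask, background)
```

In a real pipeline the mask would come from the segmentation model for each frame, and the quality of the cut depends on how accurate and temporally stable those masks are.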
Taxonomy
Topics: Advanced Vision and Imaging · Visual Attention and Saliency Detection · Advanced Image Processing Techniques
