MODNet: Real-Time Trimap-Free Portrait Matting via Objective Decomposition
Zhanghan Ke, Jiayu Sun, Kaican Li, Qiong Yan, Rynson W.H. Lau

TL;DR
MODNet is a lightweight, real-time portrait matting network that eliminates the need for auxiliary inputs, using explicit constraints and novel modules to achieve high accuracy and efficiency suitable for practical applications.
Contribution
The paper introduces MODNet, a novel trimap-free portrait matting model that employs objective decomposition, an efficient ASPP module, and a self-supervised strategy for domain adaptation.
Findings
Runs at 67 FPS on a 1080Ti GPU
Outperforms previous trimap-free methods on benchmark datasets
Effective in real-world portrait and video applications
Abstract
Existing portrait matting methods either require auxiliary inputs that are costly to obtain or involve multiple stages that are computationally expensive, making them less suitable for real-time applications. In this work, we present a light-weight matting objective decomposition network (MODNet) for portrait matting in real-time with a single input image. The key idea behind our efficient design is to optimize a series of sub-objectives simultaneously via explicit constraints. In addition, MODNet includes two novel techniques for improving model efficiency and robustness. First, an Efficient Atrous Spatial Pyramid Pooling (e-ASPP) module is introduced to fuse multi-scale features for semantic estimation. Second, a self-supervised sub-objectives consistency (SOC) strategy is proposed to adapt MODNet to real-world data to address the domain shift problem common to trimap-free methods.…
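To make "optimizing a series of sub-objectives simultaneously" concrete, the sketch below combines several per-objective losses into one weighted total that a single training step could minimize. It is only an illustration of the general idea, not the paper's loss: the sub-objective names (`semantic`, `detail`, `fusion`), the L1 distance, the toy values, and the weights are all assumptions made for this example.

```python
from typing import Dict, Sequence


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error between two equally sized flat sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def joint_loss(sub_losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of sub-objective losses, so all are optimized together."""
    return sum(weights[name] * value for name, value in sub_losses.items())


# Hypothetical sub-objectives with toy predictions and targets (assumed values).
sub_losses = {
    "semantic": l1_loss([0.0, 1.0, 1.0], [0.0, 1.0, 0.0]),  # coarse region estimate
    "detail": l1_loss([0.4, 0.6], [0.5, 0.5]),              # boundary pixels only
    "fusion": l1_loss([0.0, 0.5, 1.0], [0.0, 0.5, 1.0]),    # final alpha matte
}

# Hypothetical weights; in practice these would be tuned hyperparameters.
weights = {"semantic": 1.0, "detail": 10.0, "fusion": 1.0}

total = joint_loss(sub_losses, weights)
```

Because every sub-loss contributes to the same scalar, one backward pass would update all parts of a model at once, in contrast to multi-stage pipelines that train or run each stage separately.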
Taxonomy
Topics: Image Enhancement Techniques · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques
Methods: MODNet
