DFIMat: Decoupled Flexible Interactive Matting in Multi-Person Scenarios

Siyi Jiao; Wenzheng Zeng; Changxin Gao; Nong Sang

arXiv:2410.09788·cs.CV·October 15, 2024

TL;DR

DFIMat introduces a decoupled, flexible interactive portrait matting framework that improves interpretability, multi-input handling, and multi-round refinement, supported by a new synthetic dataset and extensive experiments.

Contribution

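The paper proposes a decoupled framework for interactive portrait matting that targets three limitations of existing methods: tightly coupled networks, support for only a single type of user input, and under-explored multi-round interaction. It also introduces a new synthetic dataset for multi-person scenarios.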

Findings

01. Decoupling the task into sub-tasks improves performance and makes each sub-task easier to learn.

02. Flexible multi-type user inputs improve both effectiveness and efficiency.

03. DFIMat outperforms existing methods in the paper's experiments.

Abstract

Interactive portrait matting refers to extracting, from a given image, the soft portrait that best matches the user's intent as expressed through their inputs. Existing methods often underperform in complex scenarios, mainly due to three factors. (1) Most works apply a tightly coupled network that directly predicts matting results, which lacks interpretability and results in inadequate modeling. (2) Existing works are limited to a single type of user input, which is ineffective for understanding intent and inefficient for the user to operate. (3) The multi-round nature of user interaction, which is crucial in practice, has been under-explored. To alleviate these limitations, we propose DFIMat, a decoupled framework that enables flexible interactive matting. Specifically, we first decouple the task into two sub-tasks: localizing target instances by understanding scene semantics and the flexible user inputs,…

Code & Models

Repositories

jiaosiyi/dfimat (Official)

Taxonomy

Topics: Human Motion and Animation · Video Analysis and Summarization · Speech and dialogue systems