Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction

Bo Xu; Han Huang; Cheng Lu; Ziwen Li; Yandong Guo

arXiv:2110.03278·cs.CV·October 25, 2021

TL;DR

This paper introduces a self-supervised multi-modality approach for human-object interaction foreground matting from RGB images, eliminating the need for additional inputs like trimaps or backgrounds.

Contribution

It reformulates foreground matting as a self-supervised multi-modality problem and proposes a novel Complementary Learning method to improve accuracy without labeled data.

Findings

01 Outperforms state-of-the-art methods in human-object foreground matting.

02 Effectively utilizes depth, segmentation, and interaction heatmap modalities.

03 Self-supervised learning reduces reliance on labeled training data.

Abstract

Most existing human matting algorithms try to separate a pure human-only foreground from the background. In this paper, we propose a Virtual Multi-modality Foreground Matting (VMFM) method to learn the human-object interactive foreground (the human and the objects they interact with) from a raw RGB image. The VMFM method requires no additional inputs, e.g. a trimap or a known background. We reformulate foreground matting as a self-supervised multi-modality problem: each input image is factored into an estimated depth map, a segmentation mask, and an interaction heatmap using three auto-encoders. To fully utilize the characteristics of each modality, we first train a dual encoder-to-decoder network to estimate the same alpha matte. Then we introduce a self-supervised method, Complementary Learning (CL), to predict a deviation probability map and exchange reliable gradients across modalities without…
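For readers unfamiliar with the term, an alpha matte is conventionally defined through the compositing model I = alpha * F + (1 - alpha) * B, where F is the foreground, B the background, and alpha a per-pixel opacity in [0, 1]. This is the standard matting formulation rather than something stated in the abstract, and the sketch below is only an illustration of that model, not the VMFM method. The function name `composite` and its parameters are hypothetical.

```python
import numpy as np

def composite(alpha, foreground, background):
    """Blend a foreground over a background using a per-pixel alpha matte.

    alpha:      (H, W) array of opacities; values outside [0, 1] are clipped.
    foreground: (H, W, C) array of foreground colors.
    background: (H, W, C) array of background colors.
    """
    # Add a channel axis so alpha broadcasts across the color channels.
    a = np.clip(alpha, 0.0, 1.0)[..., None]
    return a * foreground + (1.0 - a) * background

# Tiny example: a white foreground over a black background.
alpha = np.array([[0.0, 0.5], [1.0, 0.25]])
fg = np.ones((2, 2, 3))
bg = np.zeros((2, 2, 3))
image = composite(alpha, fg, bg)
```

With a white foreground and black background, each output pixel simply equals its alpha value, which makes the matte easy to inspect visually.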

Code & Models

Repositories

jacksyu/hoi-matting (Official)

Taxonomy

Topics: Image Enhancement Techniques · Advanced Vision and Imaging · Visual Attention and Saliency Detection

Methods: Heatmap