Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency

Qing Liu; Vignesh Ramanathan; Dhruv Mahajan; Alan Yuille; Zhenheng Yang

arXiv:2103.12886 · cs.CV · March 25, 2021 · 1 citation

TL;DR

This paper trains weakly supervised instance segmentation models on weakly labeled videos instead of images, using motion and the temporal consistency of predictions across frames as additional training signals. It reports $AP_{50}$ gains over image-based methods of 5% on Youtube-VIS and 3% on Cityscapes.

Contribution

The paper claims to be the first to use video signals for weakly supervised instance segmentation. It proposes two ways to use them: adapting the inter-pixel relation network (IRN) to incorporate motion information during training, and a new MaskConsist module.

Findings

01

Improved $AP_{50}$ by 5% on Youtube-VIS dataset.

02

Enhanced $AP_{50}$ by 3% on Cityscapes dataset.

03

Demonstrated effectiveness of temporal cues in weakly supervised segmentation.
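
For readers unfamiliar with the metric in the findings above: $AP_{50}$ is average precision computed with a mask IoU threshold of 0.5, so a predicted instance counts as correct only if it overlaps a ground-truth instance with IoU of at least 0.5. The sketch below is a minimal illustration of that matching criterion only, not the full AP computation and not code from the paper. The function names and the toy masks are assumptions made for this example.

```python
def mask_iou(mask_a, mask_b):
    """Intersection-over-union of two same-sized binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                inter += 1
            if a or b:
                union += 1
    # Two empty masks have no overlap to speak of; treat the IoU as 0.
    return inter / union if union else 0.0


def is_match_at_50(pred_mask, gt_mask, threshold=0.5):
    """True if the prediction would count as a hit at the given IoU threshold."""
    return mask_iou(pred_mask, gt_mask) >= threshold


# Toy data (hypothetical): a 4-pixel object, a partial prediction, a fuller one.
gt_mask = [[1, 1, 1, 1], [0, 0, 0, 0]]
partial_pred = [[1, 0, 0, 0], [0, 0, 0, 0]]  # covers 1 of 4 object pixels
fuller_pred = [[1, 1, 1, 0], [0, 0, 0, 0]]   # covers 3 of 4 object pixels

print(mask_iou(partial_pred, gt_mask), is_match_at_50(partial_pred, gt_mask))
print(mask_iou(fuller_pred, gt_mask), is_match_at_50(fuller_pred, gt_mask))
```

The partial prediction falls below the 0.5 threshold and is therefore scored as a miss, which is one way partial segmentation of objects can lower $AP_{50}$.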

Abstract

Weakly supervised instance segmentation reduces the cost of annotations required to train models. However, existing approaches which rely only on image-level class labels predominantly suffer from errors due to (a) partial segmentation of objects and (b) missing object predictions. We show that these issues can be better addressed by training with weakly labeled videos instead of images. In videos, motion and temporal consistency of predictions across frames provide complementary signals which can help segmentation. We are the first to explore the use of these video signals to tackle weakly supervised instance segmentation. We propose two ways to leverage this information in our model. First, we adapt inter-pixel relation network (IRN) to effectively incorporate motion information during training. Second, we introduce a new MaskConsist module, which addresses the problem of missing…
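
To make the idea of temporal consistency across frames concrete, here is a minimal sketch of a cross-frame check: it flags instances predicted in one frame that have no overlapping counterpart in the next frame, which is one symptom of a missing object prediction. This is not the paper's MaskConsist module or its IRN adaptation. It assumes objects barely move between frames (no motion compensation), and the function names, the mask representation (sets of pixel coordinates), and the 0.5 threshold are all assumptions made for this example.

```python
def iou(mask_a, mask_b):
    """IoU of two masks, each given as a set of (row, col) pixel coordinates."""
    union = len(mask_a | mask_b)
    return len(mask_a & mask_b) / union if union else 0.0


def unmatched_instances(frame_t, frame_next, threshold=0.5):
    """Indices of masks in frame_t with no mask in frame_next at IoU >= threshold.

    Assumes near-static objects: masks are compared in place, without warping
    them by any motion estimate.
    """
    missing = []
    for index, mask in enumerate(frame_t):
        if not any(iou(mask, other) >= threshold for other in frame_next):
            missing.append(index)
    return missing


# Toy predictions (hypothetical): two instances in frame t, one in frame t+1.
frame_t = [{(0, 0), (0, 1)}, {(5, 5), (5, 6)}]
frame_next = [{(0, 0), (0, 1), (0, 2)}]

print(unmatched_instances(frame_t, frame_next))
```

In this toy case the first instance is matched (IoU 2/3) and the second has no counterpart, so index 1 is reported as a candidate missing prediction in the next frame.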

Taxonomy

Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning