Contextual Guided Segmentation Framework for Semi-supervised Video Instance Segmentation

Trung-Nghia Le; Tam V. Nguyen; Minh-Triet Tran

arXiv:2106.03330·cs.CV·April 12, 2022

TL;DR

This paper introduces a comprehensive semi-supervised video instance segmentation framework that combines instance re-identification, contextual cues, and guided attention to improve segmentation accuracy across challenging scenarios.

Contribution

The proposed CGS framework segments video instances in multiple passes, combining techniques that include Instance Re-Identification Flow, skeleton-guided segmentation, and ROI-based fine segmentation.

Findings

01

Achieved top performance in the DAVIS Challenge, with a global score above 75%.

02

Demonstrated effective handling of occlusion, deformation, and re-appearance.

03

Outperformed previous methods on benchmark datasets.

Abstract

In this paper, we propose the Contextual Guided Segmentation (CGS) framework for video instance segmentation in three passes. In the first pass, i.e., preview segmentation, we propose Instance Re-Identification Flow to estimate the main properties of each instance (i.e., human/non-human, rigid/deformable, known/unknown category) by propagating its preview mask to other frames. In the second pass, i.e., contextual segmentation, we introduce multiple contextual segmentation schemes. For a human instance, we develop skeleton-guided segmentation in a frame along with object flow to correct and refine the result across frames. For a non-human instance, if the instance has a wide variation in appearance and belongs to known categories (which can be inferred from the initial mask), we adopt instance segmentation. If the non-human instance is nearly rigid, we train FCNs on synthesized images from the first…
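The routing described for the second pass can be read as a small decision procedure over the properties estimated in the first pass. The following is a minimal sketch of that reading, not the authors' implementation: the names `InstanceProperties`, `Scheme`, and `choose_scheme` are hypothetical, and any case the excerpt does not describe is returned as `UNSPECIFIED` rather than guessed.

```python
from dataclasses import dataclass
from enum import Enum


class Scheme(Enum):
    """Second-pass schemes named in the abstract excerpt (labels are illustrative)."""
    SKELETON_GUIDED = "skeleton-guided segmentation with object flow"
    INSTANCE_SEGMENTATION = "instance segmentation for known categories"
    FCN_ON_SYNTHESIZED = "FCNs trained on synthesized images"
    UNSPECIFIED = "case not described in the excerpt"


@dataclass(frozen=True)
class InstanceProperties:
    """Hypothetical container for the properties estimated in the preview pass."""
    is_human: bool
    is_rigid: bool          # False means deformable / wide appearance variation
    known_category: bool


def choose_scheme(props: InstanceProperties) -> Scheme:
    """Pick a contextual segmentation scheme from the estimated properties."""
    if props.is_human:
        # Human instances use skeleton guidance refined by object flow.
        return Scheme.SKELETON_GUIDED
    if props.is_rigid:
        # Nearly rigid non-human instances use FCNs trained on synthesized images.
        return Scheme.FCN_ON_SYNTHESIZED
    if props.known_category:
        # Deformable non-human instances of a known category use instance segmentation.
        return Scheme.INSTANCE_SEGMENTATION
    # Deformable, unknown-category instances are not covered by the excerpt.
    return Scheme.UNSPECIFIED


if __name__ == "__main__":
    example = InstanceProperties(is_human=False, is_rigid=True, known_category=False)
    print(choose_scheme(example).value)
```

Keeping the decision in one pure function makes the per-instance choice easy to inspect; the actual schemes would be separate models behind each branch.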
