Video Instance Segmentation with a Propose-Reduce Paradigm

Huaijia Lin; Ruizheng Wu; Shu Liu; Jiangbo Lu; Jiaya Jia

arXiv:2103.13746·cs.CV·October 1, 2021

TL;DR

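This paper introduces the Propose-Reduce paradigm for video instance segmentation, which generates complete instance sequences for a video in a single step instead of merging per-frame or per-clip results. Proposing multiple sequences and removing redundant ones keeps the framework robust and recall high, and the method reports state-of-the-art results.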

Contribution

The paper proposes the Propose-Reduce paradigm and a sequence propagation head, built on an existing image-level instance segmentation network, for long-term propagation.

Findings

Achieved 47.6% AP on the YouTube-VIS validation set.

Achieved 70.4% J&F on the DAVIS-UVOS dataset.

Reports state-of-the-art performance on both benchmarks.

Abstract

Video instance segmentation (VIS) aims to segment and associate all instances of predefined classes for each frame in videos. Prior methods usually obtain segmentation for a frame or clip first, and merge the incomplete results by tracking or matching. These methods may cause error accumulation in the merging step. In contrast, we propose a new paradigm, Propose-Reduce, to generate complete sequences for input videos in a single step. We further build a sequence propagation head on the existing image-level instance segmentation network for long-term propagation. To ensure robustness and high recall of our proposed framework, multiple sequences are proposed, and redundant sequences of the same instance are reduced. We achieve state-of-the-art performance on two representative benchmark datasets: we obtain 47.6% in terms of AP on YouTube-VIS validation set and 70.4% for J&F on…
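
The abstract says that redundant sequences of the same instance are "reduced" but does not say how. One plausible reading is a greedy non-maximum suppression over whole sequences, sketched below. Everything in it is an assumption made for illustration, not the paper's implementation: the names `sequence_iou` and `reduce_sequences`, the 0.5 threshold, the use of one confidence score per sequence, and the representation of a mask as a set of pixel coordinates.

```python
from typing import List, Set, Tuple

Mask = Set[Tuple[int, int]]   # (row, col) pixels of one instance in one frame
Sequence = List[Mask]         # one mask per frame, covering the whole video


def sequence_iou(a: Sequence, b: Sequence) -> float:
    """Spatio-temporal IoU: intersection and union are summed over all frames."""
    inter = sum(len(ma & mb) for ma, mb in zip(a, b))
    union = sum(len(ma | mb) for ma, mb in zip(a, b))
    return inter / union if union else 0.0


def reduce_sequences(seqs: List[Sequence], scores: List[float],
                     iou_thresh: float = 0.5) -> List[int]:
    """Greedy NMS over sequences (hypothetical reduce step).

    Returns the indices of the kept sequences, highest score first.
    A sequence is dropped when it overlaps an already kept one by
    at least `iou_thresh`.
    """
    order = sorted(range(len(seqs)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(sequence_iou(seqs[i], seqs[k]) < iou_thresh for k in kept):
            kept.append(i)
    return kept


def box(r0: int, c0: int, r1: int, c1: int) -> Mask:
    """Helper for the example: a filled rectangle as a pixel set."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Two proposals that follow the same object, plus one distinct object.
seq_a = [box(0, 0, 4, 4), box(0, 1, 4, 5)]
seq_b = [box(0, 0, 4, 4), box(0, 1, 4, 4)]
seq_c = [box(10, 10, 12, 12), box(10, 10, 12, 12)]

kept = reduce_sequences([seq_a, seq_b, seq_c], scores=[0.9, 0.8, 0.7])
print(kept)  # → [0, 2]
```

In this toy example `seq_b` overlaps `seq_a` with a sequence IoU of 0.875, so it is suppressed, while the disjoint `seq_c` survives. Measuring overlap over the whole video rather than per frame is what lets the reduction operate on complete sequences instead of frame-level fragments.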

Code & Models

Repositories

dvlab-research/proposereduce
PyTorch · Official

Taxonomy

Topics: Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging

Methods: Region Proposal Network · RoIAlign · Softmax · Convolution · Mask R-CNN