PM-VIS: High-Performance Box-Supervised Video Instance Segmentation
Zhangjing Yang, Dun Liu, Wensheng Cheng, Jinqiao Wang, Yi Wu

TL;DR
This paper introduces PM-VIS, a high-performance box-supervised video instance segmentation method that leverages pseudo masks generated from multiple models and filtering techniques to achieve state-of-the-art results, reducing annotation effort.
Contribution
It proposes a novel approach combining high-quality pseudo masks and a new algorithm to significantly improve box-supervised video instance segmentation performance.
Findings
Achieves state-of-the-art results on YouTube-VIS and OVIS datasets.
Narrows the performance gap between box-supervised and fully supervised methods.
Demonstrates the effectiveness of pseudo masks and filtering in training.
Abstract
Labeling pixel-wise object masks in videos is a resource-intensive and laborious process. Box-supervised Video Instance Segmentation (VIS) methods have emerged as a viable solution to mitigate the labor-intensive annotation process. In practical applications, the two-step approach is not only more flexible but also exhibits a higher recognition accuracy. Inspired by the recent success of Segment Anything Model (SAM), we introduce a novel approach that aims at harnessing instance box annotations from multiple perspectives to generate high-quality instance pseudo masks, thus enriching the information contained in instance annotations. We leverage ground-truth boxes to create three types of pseudo masks using the HQ-SAM model, the box-supervised VIS model (IDOL-BoxInst), and the VOS model (DeAOT) separately, along with three corresponding optimization mechanisms. Additionally, we…
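The abstract does not spell out how the pseudo masks from the three sources are filtered, so the following is only an illustrative sketch of one plausible box-consistency check, not the authors' algorithm. It assumes each source yields a binary mask for the same instance and keeps the candidate whose tight bounding box best matches the ground-truth box. The function names, the dictionary keys, and the `min_iou` threshold are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Mask = List[List[int]]            # binary mask as rows of 0/1
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive pixels


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Return the tight box around foreground pixels, or None if the mask is empty."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two inclusive pixel boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0]) + 1
    inter_h = min(a[3], b[3]) - max(a[1], b[1]) + 1
    inter = max(inter_w, 0) * max(inter_h, 0)
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def select_pseudo_mask(
    candidates: Dict[str, Mask], gt_box: Box, min_iou: float = 0.5
) -> Optional[str]:
    """Pick the candidate whose tight box agrees best with the ground-truth box.

    Hypothetical filter: candidates below min_iou are discarded, and None is
    returned when no candidate survives.
    """
    best_name: Optional[str] = None
    best_iou = -1.0
    for name, mask in candidates.items():
        box = mask_to_box(mask)
        if box is None:
            continue  # an empty mask carries no usable supervision
        iou = box_iou(box, gt_box)
        if iou >= min_iou and iou > best_iou:
            best_name, best_iou = name, iou
    return best_name


if __name__ == "__main__":
    gt = (1, 1, 2, 2)
    tight = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
    loose = [[1, 1, 1, 1] for _ in range(4)]
    empty = [[0, 0, 0, 0] for _ in range(4)]
    # Keys are placeholders for the three mask sources named in the abstract.
    masks = {"hq_sam": tight, "idol_boxinst": loose, "deaot": empty}
    print(select_pseudo_mask(masks, gt))
```

A box-agreement score is a natural choice under box supervision because the ground-truth box is the only trusted signal available; a real pipeline would likely also use temporal or model-confidence cues.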
Taxonomy
Topics: Video Analysis and Summarization · Video Surveillance and Tracking Methods · Image and Video Quality Assessment
Methods: VOS
