PM-VIS+: High-Performance Video Instance Segmentation without Video Annotation

Zhangjing Yang; Dun Liu; Xin Wang; Zhe Li; Barathwaj Anandan; Yi Wu

arXiv:2406.19665 · cs.CV · July 1, 2024

TL;DR

This paper presents PM-VIS+, a high-performance video instance segmentation method that leverages image datasets and semi-supervised learning to eliminate the need for costly video annotations.

Contribution

It introduces a novel approach that adapts image-based annotations for video segmentation and employs pseudo masks and semi-supervised techniques to improve accuracy without manual video annotations.

Findings

01 Achieves competitive video instance segmentation performance without manual video annotations.

02 Introduces ImageNet-bbox to supplement categories missing from the image datasets used in place of video annotations.

03 Uses pseudo masks and semi-supervised optimization on unannotated video data to improve accuracy.

Abstract

Video instance segmentation requires detecting, segmenting, and tracking objects in videos, typically relying on costly video annotations. This paper introduces a method that eliminates video annotations by utilizing image datasets. The PM-VIS algorithm is adapted to handle both bounding box and instance-level pixel annotations dynamically. We introduce ImageNet-bbox to supplement missing categories in video datasets and propose the PM-VIS+ algorithm to adjust supervision based on annotation types. To enhance accuracy, we use pseudo masks and semi-supervised optimization techniques on unannotated video data. This method achieves high video instance segmentation performance without manual video annotations, offering a cost-effective solution and new perspectives for video instance segmentation applications. The code will be available at https://github.com/ldknight/PM-VIS-plus.
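The abstract says supervision is adjusted according to annotation type but does not give the loss terms. The sketch below is one plausible reading, not the authors' implementation: a sample with a pixel-level mask gets a Dice loss, while a sample with only a bounding box gets a projection-style term that compares the predicted mask's axis projections with those of the box. All names here (`supervision_loss`, `box_projection_loss`, the box convention) are hypothetical, and plain Python lists stand in for tensors.

```python
from typing import List, Optional, Tuple

Mask = List[List[float]]         # H x W soft mask, values in [0, 1]
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive (assumed convention)


def dice_loss(pred: List[float], target: List[float], eps: float = 1e-6) -> float:
    """Return 1 - Dice coefficient between two flat vectors."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(p * p for p in pred) + sum(t * t for t in target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def mask_loss(pred: Mask, gt: Mask) -> float:
    """Pixel-level supervision: Dice loss over the flattened masks."""
    flat_pred = [v for row in pred for v in row]
    flat_gt = [v for row in gt for v in row]
    return dice_loss(flat_pred, flat_gt)


def box_projection_loss(pred: Mask, box: Box) -> float:
    """Box-level supervision: match the mask's x/y max-projections to the box."""
    h, w = len(pred), len(pred[0])
    x0, y0, x1, y1 = box
    proj_x = [max(pred[y][x] for y in range(h)) for x in range(w)]
    proj_y = [max(row) for row in pred]
    box_x = [1.0 if x0 <= x < x1 else 0.0 for x in range(w)]
    box_y = [1.0 if y0 <= y < y1 else 0.0 for y in range(h)]
    return dice_loss(proj_x, box_x) + dice_loss(proj_y, box_y)


def supervision_loss(
    pred: Mask,
    gt_mask: Optional[Mask] = None,
    gt_box: Optional[Box] = None,
) -> float:
    """Pick the loss from whichever annotation the sample carries."""
    if gt_mask is not None:
        return mask_loss(pred, gt_mask)
    if gt_box is not None:
        return box_projection_loss(pred, gt_box)
    raise ValueError("sample has neither a mask nor a box annotation")


# A 4x4 prediction that fills the 2x2 square at rows 1-2, columns 1-2.
pred = [[1.0 if 1 <= x < 3 and 1 <= y < 3 else 0.0 for x in range(4)]
        for y in range(4)]

loss_from_mask = supervision_loss(pred, gt_mask=pred)         # mask-annotated sample
loss_from_box = supervision_loss(pred, gt_box=(1, 1, 3, 3))   # box-only sample
```

With this dispatch, a batch can mix mask-annotated samples and box-only samples (such as ImageNet-bbox), and each contributes whichever term its annotation supports. A real training loop would use differentiable tensor operations instead of lists.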

Code & Models

Repositories

ldknight/pm-vis-plus (PyTorch, official)

Taxonomy

Topics: Video Analysis and Summarization · Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis