The Runner-up Solution for YouTube-VIS Long Video Challenge 2022

Junfeng Wu; Yi Jiang; Qihao Liu; Xiang Bai; Song Bai

arXiv:2211.09973 · cs.CV · November 21, 2022

TL;DR

This paper presents a second-place solution for the YouTube-VIS Long Video Challenge 2022, enhancing video instance segmentation with pseudo labels and contrastive learning for better temporal consistency.

Contribution

The report builds on the online video instance segmentation method IDOL and uses pseudo labels to aid contrastive learning, yielding more temporally consistent instance embeddings.

Findings

01. Achieved 40.2 AP on the YouTube-VIS 2022 long video dataset.

02. Ranked second in the ECCV 2022 YouTube-VIS Long Video Challenge.

03. Demonstrated that pseudo labels help contrastive learning and improve tracking between frames.

Abstract

This technical report describes our 2nd-place solution for the ECCV 2022 YouTube-VIS Long Video Challenge. We adopt the previously proposed online video instance segmentation method IDOL for this challenge. In addition, we use pseudo labels to further help contrastive learning, so as to obtain more temporally consistent instance embeddings and improve tracking performance between frames. The proposed method obtains 40.2 AP on the YouTube-VIS 2022 long video dataset and ranked second in this challenge. We hope our simple and effective method can benefit further research.
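To make the role of the instance embeddings concrete, the following is a minimal sketch of a generic InfoNCE-style contrastive loss and a nearest-neighbour association step between frames. It is an illustration under stated assumptions, not the authors' implementation: the function names `contrastive_loss` and `match_instances`, the temperature value, and the greedy matching rule are all hypothetical, and the abstract does not specify the loss form or how pseudo labels select positive and negative samples.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def contrastive_loss(query, positives, negatives, temperature=0.1):
    """Generic InfoNCE-style loss (assumed form, not taken from the report).

    query:     embedding of an instance in the key frame
    positives: embeddings of the same instance in a reference frame
    negatives: embeddings of other instances in the reference frame
    In this sketch the positive/negative split could come from ground-truth
    or pseudo-label annotations; that choice is an assumption.
    """
    q = normalize(query)
    pos = [dot(q, normalize(p)) / temperature for p in positives]
    neg = [dot(q, normalize(n)) / temperature for n in negatives]
    logits = pos + neg
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    # Average negative log-probability of picking each positive.
    return sum(log_denominator - p for p in pos) / len(pos)


def match_instances(prev_embeddings, curr_embeddings):
    """For each current-frame instance, return the index of the most similar
    previous-frame instance (greedy cosine matching, a simplification)."""
    prev = [normalize(p) for p in prev_embeddings]
    matches = []
    for c in curr_embeddings:
        c = normalize(c)
        sims = [dot(c, p) for p in prev]
        matches.append(sims.index(max(sims)))
    return matches
```

In this sketch, the loss is small when the query is close to its positives and far from its negatives, which is the property that would make embeddings of the same instance stay similar across frames. A real tracker would typically add one-to-one assignment and a similarity threshold for new or disappearing instances, which are omitted here.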

Taxonomy

Topics: Video Analysis and Summarization · Video Surveillance and Tracking Methods · Human Pose and Action Recognition