The Runner-up Solution for YouTube-VIS Long Video Challenge 2022
Junfeng Wu, Yi Jiang, Qihao Liu, Xiang Bai, Song Bai

TL;DR
This paper presents the second-place solution for the YouTube-VIS Long Video Challenge 2022. It builds on the online video instance segmentation method IDOL and uses pseudo labels to aid contrastive learning, yielding more temporally consistent instance embeddings.
Contribution
It uses pseudo labels to aid contrastive learning in the IDOL method, producing more temporally consistent instance embeddings and better tracking between frames.
Findings
Achieved 40.2 AP on the YouTube-VIS 2022 long video dataset.
Ranked second in the ECCV 2022 challenge.
Demonstrated effectiveness of pseudo labels in improving tracking.
Abstract
This technical report describes our 2nd-place solution for the ECCV 2022 YouTube-VIS Long Video Challenge. We adopt the previously proposed online video instance segmentation method IDOL for this challenge. In addition, we use pseudo labels to further help contrastive learning, so as to obtain more temporally consistent instance embeddings and improve tracking performance between frames. The proposed method obtains 40.2 AP on the YouTube-VIS 2022 long video dataset and ranked second in this challenge. We hope our simple and effective method can benefit further research.
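The idea of using pseudo labels to supply extra training pairs for contrastive learning can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `select_pseudo_labels` and `contrastive_loss`, the confidence threshold, the dictionary layout of a prediction, and the specific pairwise form of the loss are all assumptions made for illustration. The abstract does not state how pseudo labels are selected or which loss is used.

```python
import math


def dot(a, b):
    """Inner product of two equal-length embedding vectors."""
    return sum(x * y for x, y in zip(a, b))


def select_pseudo_labels(predictions, score_threshold=0.5):
    """Keep confident predictions as pseudo-labeled instances.

    Assumption: each prediction is a dict with a 'score' and an 'embedding'.
    The threshold rule is hypothetical, not taken from the report.
    """
    return [p for p in predictions if p["score"] >= score_threshold]


def contrastive_loss(query, positives, negatives):
    """Pairwise contrastive loss: log(1 + sum_p sum_n exp(q.n - q.p)).

    The loss is small when the query embedding is closer to embeddings of
    the same instance (positives) than to other instances (negatives).
    """
    total = 0.0
    for p in positives:
        for n in negatives:
            total += math.exp(dot(query, n) - dot(query, p))
    return math.log1p(total)


# Hypothetical predictions on a frame without ground-truth annotations.
frame_predictions = [
    {"score": 0.9, "embedding": [1.0, 0.0]},   # confident: kept
    {"score": 0.2, "embedding": [0.0, 1.0]},   # low confidence: dropped
]
pseudo = select_pseudo_labels(frame_predictions)

# Query embedding of an instance in the key frame; the kept pseudo label
# acts as its positive, and another instance acts as the negative.
query = [1.0, 0.0]
loss = contrastive_loss(
    query,
    positives=[p["embedding"] for p in pseudo],
    negatives=[[0.0, 1.0]],
)
```

In this sketch, pseudo-labeled instances simply enlarge the pool of positives and negatives available to the loss, which is one plausible way that pseudo labels could encourage embeddings of the same instance to stay consistent across frames.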
Taxonomy
Topics: Video Analysis and Summarization · Video Surveillance and Tracking Methods · Human Pose and Action Recognition
