Less than Few: Self-Shot Video Instance Segmentation
Pengwan Yang, Yuki M. Asano, Pascal Mettes, and Cees G. M. Snoek

TL;DR
This paper introduces self-shot learning for video instance segmentation: instead of relying on manually labelled support videos, it retrieves relevant support videos automatically using self-supervised embeddings, and it reports performance competitive with few-shot methods.
Contribution
It proposes the first self-shot learning framework for video instance segmentation, leveraging self-supervised embeddings for unsupervised support retrieval and surpassing some few-shot methods.
Findings
Self-shot learning can outperform traditional few-shot approaches.
The method scales effectively to large unlabelled video collections.
Combining self-shot with semi-supervised learning improves results.
Abstract
The goal of this paper is to bypass the need for labelled examples in few-shot video understanding at run time. While few-shot learning has proven effective, in many practical video settings even labelling a few examples appears unrealistic. This is especially true as the level of detail in spatio-temporal video understanding, and with it the complexity of annotations, continues to increase. Rather than performing few-shot learning with a human oracle to provide a few densely labelled support videos, we propose to automatically learn to find appropriate support videos given a query. We call this self-shot learning and we outline a simple self-supervised learning method to generate an embedding space well-suited for unsupervised retrieval of relevant samples. To showcase this novel setting, we tackle, for the first time, video instance segmentation in a self-shot (and few-shot) setting, where the goal is to…
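The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes that every video has already been mapped to an embedding vector by some self-supervised encoder (not shown), and it ranks the unlabelled pool by cosine similarity to the query. The similarity measure, the top-k selection, the function names, and the toy vectors are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_support_videos(query_embedding, pool_embeddings, k=1):
    """Return the ids of the k pool videos whose embeddings are closest to the query.

    pool_embeddings maps a hypothetical video id to its embedding vector.
    No labels are used: ranking relies on embedding similarity only.
    """
    scored = [
        (cosine_similarity(query_embedding, embedding), video_id)
        for video_id, embedding in pool_embeddings.items()
    ]
    # Highest similarity first; ties are broken by video id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [video_id for _, video_id in scored[:k]]


# Toy, hand-written embeddings standing in for the output of a self-supervised encoder.
pool = {
    "video_a": [0.9, 0.1, 0.0],
    "video_b": [0.0, 1.0, 0.0],
    "video_c": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

support_ids = retrieve_support_videos(query, pool, k=2)
print(support_ids)
```

In this toy pool the two videos whose embeddings point in nearly the same direction as the query are selected as the "self-shot" support set, while the orthogonal one is ignored. A real pipeline would replace the brute-force loop with an approximate nearest-neighbour index when the unlabelled collection is large.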
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
