Less than Few: Self-Shot Video Instance Segmentation

Pengwan Yang, Yuki M. Asano, Pascal Mettes, and Cees G. M. Snoek

arXiv:2204.08874 · cs.CV · April 20, 2022

Open Access

TL;DR

This paper introduces self-shot learning for video instance segmentation: given a query video, the method automatically retrieves relevant support videos using self-supervised embeddings rather than manual labels, and it reports performance competitive with few-shot methods.

Contribution

The paper proposes the first self-shot learning framework for video instance segmentation. It uses self-supervised embeddings to retrieve support videos without supervision, and it surpasses some few-shot methods.

Findings

01. Self-shot learning can outperform traditional few-shot approaches.

02. The method scales effectively to large unlabelled video collections.

03. Combining self-shot with semi-supervised learning improves results.

Abstract

The goal of this paper is to bypass the need for labelled examples in few-shot video understanding at run time. While few-shot learning has proven effective, in many practical video settings labelling even a few examples is unrealistic. This is especially true as the level of detail in spatio-temporal video understanding, and with it the complexity of annotations, continues to increase. Rather than performing few-shot learning with a human oracle who provides a few densely labelled support videos, we propose to automatically learn to find appropriate support videos given a query. We call this self-shot learning, and we outline a simple self-supervised learning method to generate an embedding space well-suited for unsupervised retrieval of relevant samples. To showcase this novel setting, we tackle, for the first time, video instance segmentation in a self-shot (and few-shot) setting, where the goal is to…
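The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes each video has already been mapped to a fixed-length embedding by some self-supervised encoder, and it picks support videos by cosine similarity to the query. The function name `retrieve_support`, the use of cosine similarity, and the top-k selection are assumptions made for illustration; the abstract does not specify the authors' retrieval procedure.

```python
import numpy as np


def retrieve_support(query_emb, pool_embs, k=5):
    """Return indices and similarities of the k pool videos closest to the query.

    Hypothetical helper: embeddings are assumed to be precomputed by a
    self-supervised encoder; cosine similarity is an illustrative choice.
    """
    query_emb = np.asarray(query_emb, dtype=float)
    pool_embs = np.asarray(pool_embs, dtype=float)

    # L2-normalise so that a dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    p = pool_embs / np.linalg.norm(pool_embs, axis=1, keepdims=True)

    sims = p @ q                   # one similarity score per pool video
    top = np.argsort(-sims)[:k]    # indices sorted by descending similarity
    return top, sims[top]


# Toy usage: 100 unlabelled videos with 16-dimensional embeddings.
rng = np.random.default_rng(0)
pool = rng.normal(size=(100, 16))
query = rng.normal(size=16)
support_idx, support_sims = retrieve_support(query, pool, k=5)
```

The retrieved indices would then stand in for the human-provided support set of a few-shot pipeline. No labels are consulted at any point in this sketch.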

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques