One-shot Training for Video Object Segmentation

Baiyu Chen; Sixian Chan; Xiaoqin Zhang

arXiv:2405.14010·cs.CV·May 24, 2024

Open Access

TL;DR

This paper introduces a novel one-shot training framework for video object segmentation that requires only a single labeled frame per training video, significantly reducing annotation effort while maintaining competitive performance.

Contribution

It presents the first one-shot training method for VOS, utilizing bi-directional mask inference and reconstruction, applicable to most state-of-the-art VOS networks.

Findings

01 Achieves results comparable to fully supervised methods using only one labeled frame per training video.

02 Offers a simple end-to-end approach that is easy to implement.

03 Reduces annotation cost significantly while maintaining performance.

Abstract

Video Object Segmentation (VOS) aims to track objects across frames in a video and segment them based on the initial annotated frame of the target objects. Previous VOS works typically rely on fully annotated videos for training. However, acquiring fully annotated training videos for VOS is labor-intensive and time-consuming. Meanwhile, self-supervised VOS methods have attempted to build VOS systems through correspondence learning and label propagation. Still, the absence of mask priors harms their robustness to complex scenarios, and the label propagation paradigm makes them impractical in terms of efficiency. To address these issues, we propose, for the first time, a general one-shot training framework for VOS, requiring only a single labeled frame per training video and applicable to a majority of state-of-the-art VOS networks. Specifically, our algorithm consists of: i) Inferring…
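The abstract above is truncated, so the paper's actual algorithm is not reproduced here. As a rough illustration of the general idea behind propagating a mask from a single labeled frame and checking it by reconstructing that frame's label, the following is a minimal sketch using plain affinity-based label propagation. It is an assumption for illustration only, not the authors' method: the function names `propagate_mask` and `reconstruction_loss`, the cosine-affinity formulation, and the `temperature` parameter are all hypothetical.

```python
import numpy as np


def propagate_mask(ref_feat, ref_mask, qry_feat, temperature=0.1):
    """Propagate a soft mask from a reference frame to a query frame.

    Hypothetical helper (not from the paper).
    ref_feat, qry_feat: (N, C) per-pixel features, N = H * W.
    ref_mask: (N, K) per-pixel class probabilities (rows sum to 1).
    Returns an (N, K) soft mask for the query frame.
    """
    # L2-normalise so the dot product is a cosine similarity.
    ref = ref_feat / (np.linalg.norm(ref_feat, axis=1, keepdims=True) + 1e-8)
    qry = qry_feat / (np.linalg.norm(qry_feat, axis=1, keepdims=True) + 1e-8)
    logits = qry @ ref.T / temperature            # (N_qry, N_ref)
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    affinity = np.exp(logits)
    affinity /= affinity.sum(axis=1, keepdims=True)  # row-wise softmax
    # Each query pixel receives a weighted average of reference labels.
    return affinity @ ref_mask


def reconstruction_loss(labeled_feat, labeled_mask, unlabeled_feat,
                        temperature=0.1):
    """Forward-propagate the label, propagate it back, and score the result.

    Hypothetical helper (not from the paper). Returns the predicted mask for
    the unlabeled frame, the reconstructed mask for the labeled frame, and
    the cross-entropy between the reconstruction and the ground-truth label.
    """
    # Forward: labeled frame -> unlabeled frame.
    pred = propagate_mask(labeled_feat, labeled_mask, unlabeled_feat,
                          temperature)
    # Backward: unlabeled frame -> labeled frame.
    recon = propagate_mask(unlabeled_feat, pred, labeled_feat, temperature)
    # Cross-entropy against the single available annotation.
    log_recon = np.log(np.clip(recon, 1e-8, 1.0))
    loss = -np.mean(np.sum(labeled_mask * log_recon, axis=1))
    return pred, recon, loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_pixels, n_channels, n_classes = 20, 16, 3
    feat_a = rng.normal(size=(n_pixels, n_channels))   # labeled frame
    feat_b = rng.normal(size=(n_pixels, n_channels))   # unlabeled frame
    labels = rng.integers(0, n_classes, size=n_pixels)
    mask_a = np.eye(n_classes)[labels]                 # one-hot ground truth
    pred, recon, loss = reconstruction_loss(feat_a, mask_a, feat_b)
    print(pred.shape, recon.shape, float(loss))
```

In a real training setup the features would come from a learnable network and the loss would be backpropagated through it; the sketch only shows the data flow with fixed random features.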

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques

Methods: VOS