Iteratively Selecting an Easy Reference Frame Makes Unsupervised Video Object Segmentation Easier
Youngjo Lee, Hongje Seong, Euntai Kim

TL;DR
This paper introduces an iterative framework for unsupervised video object segmentation that selects an easy reference frame instead of defaulting to the first frame or the entire video, which improves segmentation performance.
Contribution
The paper proposes the Easy Frame Selector (EFS) and Iterative Mask Prediction (IMP), a framework that improves UVOS accuracy by iteratively selecting easier reference frames.
Findings
Achieves state-of-the-art results on DAVIS16, FBMS, and SegTrack-V2 datasets.
Improves UVOS performance by adaptively selecting easier reference frames.
Demonstrates the effectiveness of iterative frame selection in unsupervised video segmentation.
Abstract
Unsupervised video object segmentation (UVOS) is a per-pixel binary labeling problem that aims to separate the foreground object from the background in a video without using the ground truth (GT) mask of the foreground object. Most previous UVOS models use the first frame or the entire video as a reference to specify the mask of the foreground object. Our question is why the first frame should be selected as the reference frame, or why the entire video should be used to specify the mask. We believe that selecting a better reference frame can achieve better UVOS performance than using only the first frame or the entire video as a reference. In this paper, we propose the Easy Frame Selector (EFS). The EFS enables us to select an 'easy' reference frame that makes the subsequent VOS easier, thereby improving VOS performance. Furthermore, we propose a new…
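The idea of choosing a reference frame other than the first can be illustrated with a minimal sketch. It assumes each frame already has a scalar "easiness" score; the abstract does not say how EFS computes such a score, so the scores, the function names, and the outward propagation order below are illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Sequence


def select_easy_frame(scores: Sequence[float]) -> int:
    """Return the index of the frame with the highest easiness score.

    `scores` is a hypothetical per-frame score (higher means easier).
    Ties resolve to the earliest frame.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    return max(range(len(scores)), key=lambda i: scores[i])


def propagation_order(num_frames: int, ref: int) -> List[int]:
    """Visit the reference frame first, then later frames, then earlier ones.

    This ordering is an assumption about how masks could be propagated
    outward from a reference frame that is not the first frame.
    """
    if not 0 <= ref < num_frames:
        raise ValueError("ref must be a valid frame index")
    forward = list(range(ref + 1, num_frames))
    backward = list(range(ref - 1, -1, -1))
    return [ref] + forward + backward


# Example: frame 1 is scored as the easiest, so it becomes the reference.
scores = [0.2, 0.9, 0.5, 0.4]
ref = select_easy_frame(scores)
order = propagation_order(len(scores), ref)
```

With a first-frame baseline, `ref` would always be 0 and the order would be a single forward pass; here the reference can fall anywhere in the video, so propagation has to run in both directions.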
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications
Methods: VOS
