Treating Motion as Option with Output Selection for Unsupervised Video Object Segmentation
Suhwan Cho, Minhyeok Lee, Jungho Lee, MyeongAh Cho, Seungwook Park, Jaeyeob Kim, Hyunsung Jang, Sangyoun Lee

TL;DR
This paper introduces a novel approach for unsupervised video object segmentation that treats motion cues as optional, improving robustness by reducing over-reliance on motion during training and adaptively selecting the best output during testing.
Contribution
The paper proposes a motion-as-option network with adaptive output selection, enabling flexible use of motion cues and enhancing segmentation stability without external guidance.
Findings
Reduces dependency on motion cues during training.
Improves segmentation stability in challenging scenarios.
Achieves better performance with adaptive output selection.
Abstract
Unsupervised video object segmentation aims to detect the most salient object in a video without any external guidance regarding the object. Salient objects often exhibit distinctive movements compared to the background, and recent methods leverage this by combining motion cues from optical flow maps with appearance cues from RGB images. However, because optical flow maps are often closely correlated with segmentation masks, networks can become overly dependent on motion cues during training, leading to vulnerability when faced with confusing motion cues and resulting in unstable predictions. To address this challenge, we propose a novel motion-as-option network that treats motion cues as an optional component rather than a necessity. During training, we randomly input RGB images into the motion encoder instead of optical flow maps, which implicitly reduces the network's reliance on…
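The training-time step described in the abstract, where an RGB image is sometimes fed to the motion encoder in place of the optical flow map, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pick_motion_encoder_input` and the `rgb_prob` parameter are hypothetical, and the abstract does not state how often the swap happens, so the default of 0.5 is only a placeholder.

```python
import random


def pick_motion_encoder_input(rgb_image, flow_map, rgb_prob=0.5, rng=random):
    """Choose what the motion encoder receives for one training sample.

    With probability `rgb_prob` the RGB image replaces the optical flow map;
    otherwise the flow map is used as usual.

    NOTE: `rgb_prob` is a hypothetical hyperparameter for illustration only;
    the abstract does not give the actual sampling ratio.
    """
    if not 0.0 <= rgb_prob <= 1.0:
        raise ValueError("rgb_prob must be in [0, 1]")
    # rng.random() returns a float in [0.0, 1.0), so rgb_prob=1.0 always
    # selects the RGB image and rgb_prob=0.0 always selects the flow map.
    return rgb_image if rng.random() < rgb_prob else flow_map


if __name__ == "__main__":
    rng = random.Random(0)  # seeded for a repeatable demonstration
    # Strings stand in for image tensors to keep the sketch dependency-free.
    picks = [pick_motion_encoder_input("rgb", "flow", 0.5, rng) for _ in range(10)]
    print(picks)
```

Because the motion encoder sees both kinds of input during training, the network cannot assume that this branch always carries motion information, which is the mechanism the abstract credits for reducing reliance on motion cues.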
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Video Surveillance and Tracking Methods
