FlowVOS: Weakly-Supervised Visual Warping for Detail-Preserving and Temporally Consistent Single-Shot Video Object Segmentation
Julia Gong, F. Christopher Holsinger, Serena Yeung

TL;DR
FlowVOS introduces a weakly-supervised, foreground-focused visual warping method for semi-supervised video object segmentation, achieving high detail preservation and temporal consistency without extra flow supervision.
Contribution
The paper proposes a novel foreground-targeted visual warping approach that learns flow fields directly from VOS data, improving detail preservation and temporal consistency in segmentation.
Findings
Outperforms state-of-the-art offline methods that do not use extra data, as well as many online methods, on the DAVIS17 and YouTubeVOS benchmarks.
Preserves fine detail and temporal consistency in segmentation results.
Operates with fast runtimes without requiring additional flow supervision.
Abstract
We consider the task of semi-supervised video object segmentation (VOS). Our approach mitigates shortcomings in previous VOS work by addressing detail preservation and temporal consistency using visual warping. In contrast to prior work that uses full optical flow, we introduce a new foreground-targeted visual warping approach that learns flow fields from VOS data. We train a flow module to capture detailed motion between frames using two weakly-supervised losses. Our object-focused approach of warping previous foreground object masks to their positions in the target frame enables detailed mask refinement with fast runtimes without using extra flow supervision. It can also be integrated directly into state-of-the-art segmentation networks. On the DAVIS17 and YouTubeVOS benchmarks, we outperform state-of-the-art offline methods that do not use extra data, as well as many online methods…
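The core operation the abstract describes, warping a previous foreground mask to its position in the target frame using a flow field, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `warp_mask`, the backward-flow convention, the bilinear sampling, and the border clamping are all assumptions made for illustration, and the flow here is hand-written rather than predicted by a learned flow module.

```python
import numpy as np

def warp_mask(prev_mask, flow):
    """Backward-warp prev_mask (H, W) with a flow field (H, W, 2).

    Assumed convention: flow[y, x] = (dx, dy), meaning target pixel (x, y)
    takes its value from previous-frame location (x + dx, y + dy).
    Sampling is bilinear; out-of-frame coordinates are clamped to the border.
    """
    h, w = prev_mask.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Source coordinates in the previous frame, clamped to the image.
    src_x = np.clip(xs + flow[..., 0], 0, w - 1)
    src_y = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each source coordinate.
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Bilinear interpolation weights.
    wx = src_x - x0
    wy = src_y - y0

    top = (1 - wx) * prev_mask[y0, x0] + wx * prev_mask[y0, x1]
    bottom = (1 - wx) * prev_mask[y1, x0] + wx * prev_mask[y1, x1]
    return (1 - wy) * top + wy * bottom

# Toy example: a one-pixel "object" at row 2, column 1 moves one pixel right,
# so every target pixel samples from one pixel to its left (dx = -1).
prev_mask = np.zeros((5, 5))
prev_mask[2, 1] = 1.0
flow = np.zeros((5, 5, 2))
flow[..., 0] = -1.0
warped = warp_mask(prev_mask, flow)
```

In this toy case the object pixel lands at row 2, column 2 of `warped`. In a setting like the one the abstract describes, the flow would come from a trained module and the warped mask would serve as input to further refinement rather than as the final segmentation.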
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
Methods: VOS
