FODVid: Flow-guided Object Discovery in Videos
Silky Singh, Shripad Deshmukh, Mausoom Sarkar, Rishabh Jain, Mayur Hemani, and Balaji Krishnamurthy

TL;DR
FODVid introduces a flow-guided, unsupervised video object segmentation pipeline that leverages appearance, flow, and temporal cues, achieving competitive results with a simple yet effective approach.
Contribution
The paper presents a novel, simple pipeline for unsupervised video object segmentation that combines flow-guided graph-cut with temporal consistency, avoiding overfitting to video nuances.
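To make the idea of a flow-guided graph-cut concrete, here is a minimal sketch, not the authors' implementation. It assumes each node (for example an image patch) has an appearance feature and an optical-flow vector, blends two Gaussian similarities into a single affinity matrix, and splits the graph with a two-way normalized cut. The function names, the blending weight `alpha`, and the bandwidths `sigma_app` and `sigma_flow` are all hypothetical choices made for illustration.

```python
import math


def gaussian_affinity(x, y, sigma):
    """Similarity in (0, 1] derived from the squared Euclidean distance."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def build_affinity(appearance, flow, alpha=0.5, sigma_app=1.0, sigma_flow=1.0):
    """Blend appearance and flow similarities into one symmetric matrix.

    alpha, sigma_app and sigma_flow are illustrative assumptions.
    """
    n = len(appearance)
    return [
        [
            alpha * gaussian_affinity(appearance[i], appearance[j], sigma_app)
            + (1.0 - alpha) * gaussian_affinity(flow[i], flow[j], sigma_flow)
            for j in range(n)
        ]
        for i in range(n)
    ]


def normalized_cut(W, iters=500):
    """Two-way normalized cut via the second eigenvector of D^-1/2 W D^-1/2.

    Uses power iteration with deflation of the known leading eigenvector,
    so no linear-algebra library is needed. Returns one 0/1 label per node.
    """
    n = len(W)
    d_sqrt = [math.sqrt(sum(row)) for row in W]
    # Shifted matrix (I + D^-1/2 W D^-1/2) / 2 has eigenvalues in [0, 1].
    M = [
        [
            0.5 * ((1.0 if i == j else 0.0) + W[i][j] / (d_sqrt[i] * d_sqrt[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]
    norm = math.sqrt(sum(v * v for v in d_sqrt))
    top = [v / norm for v in d_sqrt]  # leading eigenvector, D^1/2 * 1
    v = [(-1.0) ** i * (i + 1) for i in range(n)]  # deterministic start
    for _ in range(iters):
        # Remove the component along the leading eigenvector.
        proj = sum(a * b for a, b in zip(v, top))
        v = [a - proj * b for a, b in zip(v, top)]
        v = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    # Map back to the generalized eigenvector and threshold at zero.
    return [1 if v[i] / d_sqrt[i] > 0 else 0 for i in range(n)]


# Toy example: three moving nodes and three static background nodes.
appearance = [[0.0], [0.1], [0.05], [5.0], [5.1], [4.9]]
flow = [[2.0, 0.0], [2.0, 0.1], [1.9, 0.0], [0.0, 0.0], [0.0, 0.1], [0.1, 0.0]]
labels = normalized_cut(build_affinity(appearance, flow))
print(labels)
```

Which of the two groups receives label 1 is arbitrary, because the sign of an eigenvector is not determined; only the grouping itself is meaningful.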
Findings
Achieves results within roughly 2 mIoU points of the top methods on DAVIS16.
Demonstrates the effectiveness of flow-guided segmentation in an unsupervised setting.
Highlights the potential of simple methods for competitive video segmentation.
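The mIoU figure quoted in the findings is a mean intersection-over-union between predicted and ground-truth masks. The sketch below shows one plausible way to compute it for binary masks stored as nested lists. The helper names, the convention of scoring two empty masks as 1.0, and the choice to average over frames directly are assumptions; benchmark toolkits may average per sequence first.

```python
def iou(pred, gt):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_iou(pred_masks, gt_masks):
    """Average the per-frame IoU over a list of mask pairs."""
    scores = [iou(p, g) for p, g in zip(pred_masks, gt_masks)]
    return sum(scores) / len(scores)


# Two tiny 2x2 frames: one half-correct prediction, one exact match.
pred_masks = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
gt_masks = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]
print(mean_iou(pred_masks, gt_masks))
```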
Abstract
Segmenting objects in a video is challenging because of nuances such as motion blur, parallax, occlusions, and changes in illumination. Instead of addressing these nuances separately, we focus on building a generalizable solution that avoids overfitting to individual intricacies. Such a solution would also save the enormous resources involved in human annotation of video corpora. To solve Video Object Segmentation (VOS) in an unsupervised setting, we propose a new pipeline (FODVid) that guides segmentation outputs using flow-guided graph-cut and temporal consistency. Specifically, we design a segmentation model that incorporates intra-frame appearance and flow similarities together with the inter-frame temporal continuity of the objects under consideration. We perform an extensive experimental analysis of our straightforward methodology on the standard DAVIS16 video…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
Methods: Focus · VOS
