FODVid: Flow-guided Object Discovery in Videos

Silky Singh, Shripad Deshmukh, Mausoom Sarkar, Rishabh Jain, Mayur Hemani, and Balaji Krishnamurthy

arXiv:2307.04392 · cs.CV · July 11, 2023

TL;DR

FODVid is an unsupervised video object segmentation pipeline that combines intra-frame appearance and flow similarity with inter-frame temporal consistency. Despite its simplicity, it lands within ~2 mIoU of top methods on DAVIS16.

Contribution

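The paper presents a simple pipeline for unsupervised video object segmentation that guides segmentation outputs with flow-guided graph-cut and temporal consistency. Rather than handling nuances such as motion blur, parallax, occlusions, and illumination changes one by one, it aims for a generalizable solution that does not overfit to any of them.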

Findings

01. Achieves results within ~2 mIoU of top methods on DAVIS16.

02. Demonstrates the effectiveness of flow-guided segmentation in an unsupervised setting.

03. Highlights the potential of simple methods for competitive video segmentation.
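
To make the "~2 mIoU" figure concrete, here is a minimal sketch of how mean intersection-over-union can be computed over per-frame binary masks. The function names `iou` and `mean_iou`, the per-frame averaging, and the convention for two empty masks are assumptions made for illustration; the benchmark's official evaluation protocol may average differently (for example, per sequence).

```python
import numpy as np

def iou(pred, gt):
    """Intersection-over-union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks empty: treated as a perfect match (assumed convention).
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)

def mean_iou(pred_masks, gt_masks):
    """Average the per-frame IoU over paired sequences of masks."""
    scores = [iou(p, g) for p, g in zip(pred_masks, gt_masks)]
    return float(np.mean(scores))

# Toy example: two 2x2 frames.
gt = [np.array([[1, 1], [0, 0]]), np.array([[1, 0], [0, 0]])]
pred = [np.array([[1, 0], [0, 0]]), np.array([[1, 0], [0, 0]])]
score = mean_iou(pred, gt)  # per-frame IoUs are 0.5 and 1.0
```

Scores are usually reported as percentages, so a gap of about 2 mIoU would correspond to roughly 0.02 on this 0-to-1 scale.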

Abstract

Segmentation of objects in a video is challenging due to nuances such as motion blur, parallax, occlusions, changes in illumination, etc. Instead of addressing these nuances separately, we focus on building a generalizable solution that avoids overfitting to the individual intricacies. Such a solution would also help us save the enormous resources involved in human annotation of video corpora. To solve Video Object Segmentation (VOS) in an unsupervised setting, we propose a new pipeline (FODVid) based on the idea of guiding segmentation outputs using flow-guided graph-cut and temporal consistency. Specifically, we design a segmentation model incorporating intra-frame appearance and flow similarities, and inter-frame temporal continuation of the objects under consideration. We perform an extensive experimental analysis of our straightforward methodology on the standard DAVIS16 video…
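
The idea of cutting a graph whose edges mix appearance and flow similarity can be illustrated with a small sketch. This is not the authors' implementation: it assumes per-patch appearance features and flow vectors are already available, uses a spectral normalized cut as a stand-in for the graph-cut step, and all names (`appearance_affinity`, `flow_affinity`, `bipartition`, `align_to_previous`) and parameters (`alpha`, `sigma`) are hypothetical. Temporal consistency is reduced here to choosing the foreground side that best overlaps the previous frame's mask.

```python
import numpy as np

def appearance_affinity(feats):
    """Cosine similarity between patch features, clipped to [0, 1]."""
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    unit = feats / np.maximum(norms, 1e-8)
    return np.clip(unit @ unit.T, 0.0, 1.0)

def flow_affinity(flow, sigma=1.0):
    """Gaussian similarity between per-patch flow vectors."""
    diff = flow[:, None, :] - flow[None, :, :]
    return np.exp(-(diff ** 2).sum(axis=-1) / (2.0 * sigma ** 2))

def bipartition(feats, flow, alpha=0.5):
    """Split N patches into two groups with a spectral normalized cut.

    alpha (hypothetical) weights appearance against flow similarity.
    """
    W = alpha * appearance_affinity(feats) + (1.0 - alpha) * flow_affinity(flow)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-8))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    lap = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    # The second-smallest eigenvector, rescaled, gives the cut indicator.
    fiedler = vecs[:, 1] * d_inv_sqrt
    return fiedler > 0

def align_to_previous(mask, prev_mask):
    """Pick the polarity of the cut that overlaps the previous mask more."""
    keep = np.logical_and(mask, prev_mask).sum()
    flip = np.logical_and(~mask, prev_mask).sum()
    return mask if keep >= flip else ~mask

# Toy frame: patches 0-2 look alike and move right, patches 3-5 are static.
feats = np.array([[1.0, 0.1]] * 3 + [[0.1, 1.0]] * 3)
flow = np.array([[2.0, 0.0]] * 3 + [[0.0, 0.0]] * 3)
prev_mask = np.array([True, True, True, False, False, False])
mask = align_to_previous(bipartition(feats, flow), prev_mask)
```

The sign of an eigenvector is arbitrary, so the raw cut does not say which side is the object; in this sketch the previous frame's mask resolves that ambiguity, which is one simple way inter-frame continuity can constrain an intra-frame cut.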

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications

Methods: Focus · VOS