A Coarse-To-Fine Framework For Video Object Segmentation
Chi Zhang, Alexander Loui

TL;DR
This paper presents an unsupervised coarse-to-fine framework for video object segmentation that combines tracking, supervoxel generation, and graph-based segmentation to improve accuracy and robustness.
Contribution
It introduces a novel unsupervised coarse-to-fine approach integrating tracking, supervoxels, and graph-based segmentation for improved video object extraction.
Findings
Outperforms previous methods in accuracy
Demonstrates robustness across various video sequences
Effective in extracting salient moving objects
Abstract
In this study, we develop an unsupervised coarse-to-fine video analysis framework and prototype system to extract a salient object from a video sequence. The framework starts by tracking grid-sampled points across frames, typically using the KLT tracking method. The tracked points can be divided into several groups according to their differing motion. At the same time, the SLIC algorithm is extended into 3D space to generate supervoxels. Coarse segmentation is achieved by combining the categorized tracked points with the supervoxels of the corresponding frame in the video sequence. Finally, a graph-based fine segmentation algorithm is used to extract the moving object in the scene. Experimental results show that this method outperforms previous approaches in terms of accuracy and robustness.
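The coarse stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes point tracks (for example from KLT) and a supervoxel id per tracked point (for example from a 3D SLIC) have already been computed. The function names `group_by_motion` and `coarse_segmentation` are hypothetical. The 2-means grouping on displacement, the "smaller group is foreground" rule, and the majority vote are illustrative assumptions, since the abstract does not specify how points are grouped or combined with supervoxels.

```python
from collections import Counter, defaultdict
from math import hypot


def group_by_motion(tracks, iters=10):
    """Split point tracks into two groups by 2-means on their displacement.

    tracks: list of ((x0, y0), (x1, y1)) start/end positions per tracked point.
    Returns a list of 0/1 labels; 1 marks the smaller group (assumed foreground).
    """
    disp = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in tracks]
    # Seed the two centres with the smallest and largest displacement magnitudes.
    centres = [min(disp, key=lambda d: hypot(*d)), max(disp, key=lambda d: hypot(*d))]
    labels = [0] * len(disp)
    for _ in range(iters):
        # Assign each displacement to its nearest centre.
        labels = [
            min((0, 1), key=lambda k: hypot(d[0] - centres[k][0], d[1] - centres[k][1]))
            for d in disp
        ]
        # Move each centre to the mean of its members.
        for k in (0, 1):
            members = [d for d, lab in zip(disp, labels) if lab == k]
            if members:
                centres[k] = (
                    sum(d[0] for d in members) / len(members),
                    sum(d[1] for d in members) / len(members),
                )
    # Assumption: the salient object covers fewer grid points than the background.
    if sum(labels) > len(labels) / 2:
        labels = [1 - lab for lab in labels]
    return labels


def coarse_segmentation(point_supervoxel, labels):
    """Label each supervoxel by majority vote of the tracked points inside it.

    point_supervoxel: id of the supervoxel containing each tracked point.
    Returns {supervoxel_id: 0 or 1}; ties go to background (0).
    """
    votes = defaultdict(Counter)
    for sv, lab in zip(point_supervoxel, labels):
        votes[sv][lab] += 1
    return {sv: int(c[1] > c[0]) for sv, c in votes.items()}
```

In a fuller pipeline, the resulting per-supervoxel labels would serve as the coarse mask that a graph-based refinement step then sharpens; that step is omitted here because the abstract gives no details about it.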
Taxonomy
Topics: Visual Attention and Saliency Detection · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
