Object Detection, Tracking, and Motion Segmentation for Object-level Video Segmentation
Benjamin Drayer, Thomas Brox

TL;DR
This paper introduces a method that combines object detection, tracking, and motion cues to achieve accurate, temporally consistent object segmentation in videos, addressing common challenges like camera motion and static scenes.
Contribution
The paper links frame-level detections from an off-the-shelf detector into temporally consistent object tubes, then combines each tube's class label and motion-independent location prior with motion cues to segment every object in the video.
Findings
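Handles scenes with no object motion as well as scenes dominated by camera motion
Reports results on four video segmentation datasets: YouTube Objects, SegTrackv2, egoMotion, and FBMS
Overcomes typical failure cases of weakly supervised and unsupervised video segmentation, including objects that move as a unit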
Abstract
We present an approach for object segmentation in videos that combines frame-level object detection with concepts from object tracking and motion segmentation. The approach extracts temporally consistent object tubes based on an off-the-shelf detector. Besides the class label for each tube, this provides a location prior that is independent of motion. For the final video segmentation, we combine this information with motion cues. The method overcomes the typical problems of weakly supervised/unsupervised video segmentation, such as scenes with no motion, dominant camera motion, and objects that move as a unit. In contrast to most tracking methods, it provides an accurate, temporally consistent segmentation of each object. We report results on four video segmentation datasets: YouTube Objects, SegTrackv2, egoMotion, and FBMS.
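To make the idea of "object tubes" concrete, here is a minimal sketch of how per-frame detections could be linked into tubes. It is not the authors' algorithm: the greedy IoU matching, the `min_iou` threshold, and the names `iou` and `link_tubes` are all assumptions made for illustration, and the abstract does not specify how the tubes are actually built.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, str]              # (box, class label)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_tubes(frames: List[List[Detection]], min_iou: float = 0.5) -> List[Dict]:
    """Greedily link same-class detections in consecutive frames (illustrative only).

    Each tube is a dict with a class 'label' and 'boxes', a mapping from
    frame index to the box assigned to the tube in that frame.
    """
    tubes: List[Dict] = []
    for t, dets in enumerate(frames):
        used = set()
        for tube in tubes:
            # Only tubes that were extended in the previous frame can continue.
            if tube["last_frame"] != t - 1:
                continue
            best_j, best_score = None, min_iou
            for j, (box, label) in enumerate(dets):
                if j in used or label != tube["label"]:
                    continue
                score = iou(tube["boxes"][t - 1], box)
                if score >= best_score:
                    best_j, best_score = j, score
            if best_j is not None:
                tube["boxes"][t] = dets[best_j][0]
                tube["last_frame"] = t
                used.add(best_j)
        # Every unmatched detection starts a new tube.
        for j, (box, label) in enumerate(dets):
            if j not in used:
                tubes.append({"label": label, "boxes": {t: box}, "last_frame": t})
    return tubes
```

Because the linking relies only on detector output, a tube can persist for an object that never moves, which is the sense in which the location prior is independent of motion. A real system would also need to bridge missed detections and resolve competing matches globally; this sketch does neither.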
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
