Object Discovery in Videos as Foreground Motion Clustering
Christopher Xie, Yu Xiang, Zaid Harchaoui, Dieter Fox

TL;DR
This paper introduces a pixel-trajectory recurrent neural network that learns feature embeddings of foreground pixel trajectories and clusters them into objects, yielding dense segmentation masks linked across video frames and state-of-the-art results on commonly used motion segmentation datasets.
Contribution
A new pixel-trajectory recurrent neural network that learns feature embeddings for foreground motion clustering in videos, enabling improved object discovery.
Findings
Achieves state-of-the-art performance on motion segmentation datasets.
Effectively links foreground object masks across video frames.
Demonstrates the effectiveness of trajectory-based clustering for object discovery.
Abstract
We consider the problem of providing dense segmentation masks for object discovery in videos. We formulate the object discovery problem as foreground motion clustering, where the goal is to cluster foreground pixels in videos into different objects. We introduce a novel pixel-trajectory recurrent neural network that learns feature embeddings of foreground pixel trajectories linked across time. By clustering the pixel trajectories using the learned feature embeddings, our method establishes correspondences between foreground object masks across video frames. To demonstrate the effectiveness of our framework for object discovery, we conduct experiments on commonly used datasets for motion segmentation, where we achieve state-of-the-art performance.
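To make the clustering step concrete, here is a minimal sketch of grouping foreground trajectory embeddings into objects. It is not the authors' implementation: the abstract does not name the clustering algorithm, so a simple greedy cosine-similarity grouping is swapped in for illustration. The function name `cluster_trajectories`, the `threshold` value, and the assumption that embeddings arrive as one vector per trajectory are all hypothetical.

```python
import numpy as np


def cluster_trajectories(embeddings, foreground, threshold=0.9):
    """Greedy cosine-similarity clustering of trajectory embeddings.

    Illustrative stand-in only; the paper's actual clustering procedure
    is not described in the abstract.

    embeddings: (N, D) array, one feature vector per pixel trajectory.
    foreground: (N,) boolean array, True for foreground trajectories.
    Returns an (N,) int array: -1 for background, 0..K-1 for each object.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    # Normalize rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.maximum(norms, 1e-12)

    labels = np.full(len(unit), -1, dtype=int)
    unassigned = np.asarray(foreground, dtype=bool).copy()
    next_label = 0
    while unassigned.any():
        # Take the first unassigned foreground trajectory as a seed.
        seed = np.flatnonzero(unassigned)[0]
        similar = unassigned & (unit @ unit[seed] >= threshold)
        similar[seed] = True  # guarantees progress even for a zero vector
        labels[similar] = next_label
        unassigned &= ~similar
        next_label += 1
    return labels
```

Trajectories that share a label are treated as the same object, so every pixel on a trajectory inherits that object identity in each frame the trajectory passes through. This is how clustering trajectories can link masks across time.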
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Advanced Vision and Imaging
