Rethinking Amodal Video Segmentation from Learning Supervised Signals with Object-centric Representation
Ke Fan, Jingshi Lei, Xuelin Qian, Miaopeng Yu, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu

TL;DR
This paper introduces EoRaS, a novel object-centric approach for video amodal segmentation that leverages supervised signals, 3D projections, and multi-view fusion to accurately predict full object masks in complex scenarios.
Contribution
The paper proposes EoRaS, which combines object-centric supervision, bird's-eye-view (BEV) projection, and multi-view attention so that amodal segmentation no longer relies on motion flow alone.
Findings
Achieves state-of-the-art results on real-world and synthetic benchmarks.
Effectively handles camera motion and object deformation challenges.
Demonstrates the benefit of 3D and multi-view features in segmentation accuracy.
Abstract
Video amodal segmentation is a particularly challenging task in computer vision that requires deducing the full shape of an object from its visible parts. Recently, some studies have achieved promising performance by using motion flow to integrate information across frames in a self-supervised setting. However, motion flow is clearly limited by two factors: moving cameras and object deformation. This paper rethinks these previous works. We leverage supervised signals with object-centric representation in real-world scenarios. The underlying idea is that the supervision signal for a specific object and the features from different views can mutually benefit the deduction of the full mask in any given frame. We thus propose Efficient object-centric Representation amodal Segmentation (EoRaS). Specifically, beyond solely relying on…
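To make the task definition concrete, here is a toy sketch of how a visible mask relates to an amodal (full-shape) mask. It illustrates the problem setting only and is not the EoRaS method; the 6x6 scene, the `iou` helper, and all variable names are assumptions made up for this example.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boolean masks (hypothetical helper)."""
    a = a.astype(bool)
    b = b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # two empty masks are treated as identical
    return np.logical_and(a, b).sum() / union

# Toy scene: the target object is a 4x4 square.
amodal = np.zeros((6, 6), dtype=bool)
amodal[1:5, 1:5] = True            # full (amodal) shape

# An occluder in front of the scene covers the right-hand columns.
occluder = np.zeros((6, 6), dtype=bool)
occluder[:, 3:] = True

visible = amodal & ~occluder       # what a standard (modal) segmenter sees
occluded = amodal & occluder       # the part an amodal model must infer

# How far the visible mask alone is from the full shape.
visible_iou = iou(visible, amodal)
```

In this toy scene half of the object is hidden, so using the visible mask as the prediction leaves a large gap to the full shape; an amodal model is scored on how well it recovers the `occluded` region as well.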
Taxonomy
Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Video Surveillance and Tracking Methods
