Training for X-Ray Vision: Amodal Segmentation, Amodal Content Completion, and View-Invariant Object Representation from Multi-Camera Video

Alexander Moore; Amar Saini; Kylie Cancilla; Doug Poland; Carmen Carrano

arXiv:2507.00339 · cs.CV · July 2, 2025

TL;DR

This paper introduces MOVi-MC-AC, a large multi-camera dataset for amodal segmentation and content completion, enabling better understanding of occluded objects in complex scenes with multiple viewpoints.

Contribution

The paper presents the first multi-camera amodal dataset with ground-truth amodal content, advancing research in object detection, tracking, and occlusion reasoning in multi-view video.

Findings

01. Largest amodal dataset with 5.8 million object instances

02. First dataset providing ground-truth amodal content

03. Demonstrates the utility of multi-camera views for occlusion understanding

Abstract

Amodal segmentation and amodal content completion require using object priors to estimate occluded masks and features of objects in complex scenes. Until now, no dataset has provided an additional dimension for object context: the possibility of multiple cameras sharing a view of a scene. We introduce MOVi-MC-AC: Multiple Object Video with Multi-Cameras and Amodal Content, the largest amodal segmentation dataset and the first amodal content dataset to date. Cluttered scenes of generic household objects are simulated in multi-camera video. MOVi-MC-AC adds to the growing literature on object detection, tracking, and segmentation with two contributions to deep learning for computer vision. Multiple Camera (MC) settings where objects can be identified and tracked between various unique camera perspectives are rare in both synthetic and real-world video. We introduce a new…
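To make the abstract's terminology concrete: a modal mask covers only the pixels of an object that a camera can see, while an amodal mask covers the object's full extent, including pixels hidden behind occluders. The sketch below is illustrative only. It uses toy binary masks stored as nested lists, and the function names `occluded_region` and `occlusion_rate` are hypothetical, not part of the dataset's tooling or file format.

```python
# Illustrative only: toy binary masks as nested lists (1 = object pixel).
# These names are hypothetical and not part of MOVi-MC-AC's tooling.

def occluded_region(amodal, modal):
    """Pixels inside the object's full extent that the camera cannot see."""
    return [[a & (1 - m) for a, m in zip(arow, mrow)]
            for arow, mrow in zip(amodal, modal)]

def occlusion_rate(amodal, modal):
    """Fraction of the object's full (amodal) area that is hidden."""
    full = sum(map(sum, amodal))
    hidden = sum(map(sum, occluded_region(amodal, modal)))
    return hidden / full if full else 0.0

# Full extent of a 3x2 object, as if no occluder were present.
amodal_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
# What the camera actually sees: the lower part is covered.
modal_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

hidden = occluded_region(amodal_mask, modal_mask)
rate = occlusion_rate(amodal_mask, modal_mask)
print(hidden)
print(rate)
```

In a multi-camera setting, the same object would have one modal mask per camera, so a quantity like this could in principle be compared across views to find the perspective in which the object is least occluded.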

Code & Models

Datasets

Amar-S/MOVi-MC-AC
dataset · 408 downloads

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Face recognition and analysis