Revealing Occlusions with 4D Neural Fields
Basile Van Hoorick, Purva Tendulkar, Didac Suris, Dennis Park, Simon Stent, Carl Vondrick

TL;DR
This paper presents a 4D neural field framework that encodes point clouds from monocular RGB-D video into a continuous representation, allowing the model to persist objects and reason about them even after they become occluded in dynamic scenes.
Contribution
The paper introduces a continuous 4D neural representation that attends across spatiotemporal context to reveal occluded objects, and applies it to several tasks without any architectural changes.
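To make "attending across spatiotemporal context" concrete, here is a minimal sketch of querying a continuous field at arbitrary (x, y, z, t) coordinates by softmax attention over an input point cloud. This is not the authors' architecture: the learned attention is swapped for a fixed distance-based softmax, and the function name `query_field`, the `temperature` parameter, and the array layouts are all assumptions made for illustration.

```python
import numpy as np

def query_field(query_xyzt, point_xyzt, point_feats, temperature=1.0):
    """Attend from continuous 4D query coordinates to input points.

    Hypothetical stand-in for a learned attention layer: scores are
    negative squared distances in (x, y, z, t), not learned projections.

    query_xyzt:  (Q, 4) query coordinates (x, y, z, t)
    point_xyzt:  (N, 4) coordinates of the observed points
    point_feats: (N, C) per-point features
    Returns (Q, C) attended features and (Q, N) attention weights.
    """
    # Pairwise squared distances between queries and points: (Q, N)
    diff = query_xyzt[:, None, :] - point_xyzt[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)

    # Closer points in space-time receive higher attention scores.
    logits = -sq_dist / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability

    # Softmax over the N input points for each query.
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Each query's feature is a weighted average of the point features.
    return weights @ point_feats, weights
```

Because the query coordinate is continuous, the same function can be evaluated at a location and time where no point was observed, for example behind an occluder; in a trained model the attention scores would be learned rather than fixed by distance.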
Findings
Reveals occlusions across several tasks on two large video datasets released with the paper
Visualizations show that the attention mechanism learns to follow occluded objects automatically
Trainable end-to-end and easily adaptable
Abstract
For computer vision systems to operate in dynamic situations, they need to be able to represent and reason about object permanence. We introduce a framework for learning to estimate 4D visual representations from monocular RGB-D, which is able to persist objects, even once they become obstructed by occlusions. Unlike traditional video representations, we encode point clouds into a continuous representation, which permits the model to attend across the spatiotemporal context to resolve occlusions. On two large video datasets that we release along with this paper, our experiments show that the representation is able to successfully reveal occlusions for several tasks, without any architectural changes. Visualizations show that the attention mechanism automatically learns to follow occluded objects. Since our approach can be trained end-to-end and is easily adaptable, we believe it will be…
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Neural Network Applications · Advanced Vision and Imaging
