Adversarial Semantic Scene Completion from a Single Depth Image
Yida Wang, David Joseph Tan, Nassir Navab, Federico Tombari

TL;DR
This paper introduces an adversarial learning-based method for reconstructing, completing, and semantically labeling a 3D scene from a single depth image. Its adversarial loss terms enforce both realistic outputs and an effective embedding of the internal features.
Contribution
It presents a novel architecture that combines multiple adversarial loss terms with a correlation between latent features, and that retains the 2.5D structure of the input during downsampling at test time to improve semantic scene completion.
Findings
Achieves state-of-the-art accuracy on benchmark datasets.
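Effectively correlates the latent features of the encoder working on partial 2.5D data with those of a variational 3D auto-encoder trained to reconstruct the complete semantic scene.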
Improves semantic scene completion quality both qualitatively and quantitatively.
Abstract
We propose a method to reconstruct, complete and semantically label a 3D scene from a single input depth image. We improve the accuracy of the regressed semantic 3D maps by a novel architecture based on adversarial learning. In particular, we suggest using multiple adversarial loss terms that not only enforce realistic outputs with respect to the ground truth, but also an effective embedding of the internal features. This is done by correlating the latent features of the encoder working on partial 2.5D data with the latent features extracted from a variational 3D auto-encoder trained to reconstruct the complete semantic scene. In addition, differently from other approaches that operate entirely through 3D convolutions, at test time we retain the original 2.5D structure of the input during downsampling to improve the effectiveness of the internal representation of our model. We test our…
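The abstract describes multiple adversarial loss terms, one acting on the output and one on the internal features. As a rough illustration only, the sketch below combines a per-voxel reconstruction term with two adversarial terms from the generator's point of view. The function names, the weights `w_out` and `w_lat`, and the use of a binary cross-entropy GAN objective are all assumptions made for this example; the source text does not specify the exact loss formulation.

```python
import numpy as np

def bce(scores, target):
    """Binary cross-entropy between discriminator probabilities and a constant label."""
    eps = 1e-7
    p = np.clip(np.asarray(scores, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(target * np.log(p) + (1.0 - target) * np.log(1.0 - p)))

def voxel_cross_entropy(pred_probs, labels):
    """Mean per-voxel cross-entropy. pred_probs: (N, C) probabilities, labels: (N,) class ids."""
    eps = 1e-7
    picked = pred_probs[np.arange(labels.size), labels]
    return float(-np.mean(np.log(np.clip(picked, eps, 1.0))))

def generator_loss(pred_probs, labels, d_out_on_pred, d_lat_on_depth,
                   w_out=1.0, w_lat=1.0):
    """Hypothetical combined objective for the depth-to-scene network.

    d_out_on_pred:  output-discriminator probabilities for the predicted scene.
    d_lat_on_depth: latent-discriminator probabilities for the depth encoder's
                    latent code (assumed to be compared against codes from the
                    3D auto-encoder).
    w_out, w_lat:   assumed weighting factors, not taken from the paper.
    """
    rec = voxel_cross_entropy(pred_probs, labels)
    # The generator wants both discriminators to output "real" (label 1).
    adv_out = bce(d_out_on_pred, 1.0)
    adv_lat = bce(d_lat_on_depth, 1.0)
    return rec + w_out * adv_out + w_lat * adv_lat

# Toy usage with random stand-ins for network outputs.
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 4))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels = rng.integers(0, 4, size=8)
loss = generator_loss(probs, labels, rng.uniform(size=8), rng.uniform(size=8))
```

In this sketch the latent adversarial term plays the role of the feature correlation mentioned in the abstract: it pushes the encoder of the partial 2.5D input toward latent codes that a discriminator cannot tell apart from those of the auto-encoder.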
