Not All Voxels Are Equal: Semantic Scene Completion from the Point-Voxel Perspective
Xiaokang Chen, Jiaxiang Tang, Jingbo Wang, Gang Zeng

TL;DR
This paper introduces a point-voxel aggregation network for semantic scene completion that reduces redundant computation and improves accuracy by combining point-cloud and voxel representations.
Contribution
It proposes a new point-voxel aggregation approach with an anisotropic operator and semantic-aware propagation, enhancing scene completion efficiency and quality.
Findings
Outperforms state-of-the-art methods on two benchmarks
Uses only depth images as input for scene completion
Reduces computation redundancy in deep networks
Abstract
We revisit Semantic Scene Completion (SSC), a useful task that predicts the semantic and occupancy representation of 3D scenes. Many methods for this task rely on voxelized scene representations to preserve local scene structure. However, because of visible empty voxels, these methods suffer from heavy computational redundancy as the network goes deeper, which limits completion quality. To address this dilemma, we propose a novel point-voxel aggregation network. First, we convert the voxelized scenes to point clouds by removing the visible empty voxels and adopt a deep point stream to capture semantic information from the scene efficiently. Meanwhile, a lightweight voxel stream containing only two 3D convolution layers preserves the local structures of the voxelized scenes. Furthermore, we design an anisotropic…
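The voxel-to-point conversion described in the abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names `voxels_to_points` and `points_to_voxels`, the boolean `visible_empty` mask, and the dense feature layout `(X, Y, Z, C)` are all hypothetical choices made for the example.

```python
import numpy as np


def voxels_to_points(visible_empty, features):
    """Drop voxels flagged as visible-empty and return the rest as points.

    visible_empty: bool array of shape (X, Y, Z); True marks a voxel that is
                   observed to be empty (assumed to be given, e.g. from depth).
    features:      array of shape (X, Y, Z, C) with per-voxel features.
    Returns integer coordinates of shape (N, 3) and features of shape (N, C).
    """
    keep = ~np.asarray(visible_empty, dtype=bool)
    coords = np.argwhere(keep)            # (N, 3), row-major order
    point_features = np.asarray(features)[keep]  # (N, C), same order
    return coords, point_features


def points_to_voxels(coords, point_features, grid_shape, fill=0.0):
    """Scatter per-point features back into a dense grid.

    Voxels that were removed earlier are filled with `fill`.
    """
    channels = point_features.shape[1]
    grid = np.full(tuple(grid_shape) + (channels,), fill,
                   dtype=point_features.dtype)
    grid[tuple(coords.T)] = point_features
    return grid


# Toy scene: a 4x4x4 grid where the first two slices are visible-empty.
mask = np.zeros((4, 4, 4), dtype=bool)
mask[:2] = True
feats = np.random.rand(4, 4, 4, 8).astype(np.float32)

coords, point_feats = voxels_to_points(mask, feats)
dense = points_to_voxels(coords, point_feats, mask.shape)
print(coords.shape, point_feats.shape, dense.shape)
```

In this toy scene half of the voxels are discarded before any deep processing, which is the kind of saving the abstract attributes to removing visible empty voxels. How the real network computes the mask and fuses the two streams is not covered by this sketch.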
Taxonomy
Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques
Methods: 3D Convolution · Convolution
