Multi-Resolution Alignment for Voxel Sparsity in Camera-Based 3D Semantic Scene Completion
Zhiwen Yang, Yuxin Peng

TL;DR
This paper introduces a Multi-Resolution Alignment (MRA) approach that improves camera-based 3D semantic scene completion by addressing voxel sparsity through scene-level and instance-level alignment across multi-resolution 3D features.
Contribution
It proposes novel modules for multi-resolution feature alignment, semantic significance identification, and critical distribution alignment to enhance voxel-level scene understanding.
Findings
Improved accuracy in voxel occupancy and semantic labeling.
Enhanced model performance on autonomous driving datasets.
Effective mitigation of voxel sparsity issues.
Abstract
Camera-based 3D semantic scene completion (SSC) offers a cost-effective solution for assessing the geometric occupancy and semantic labels of each voxel in the surrounding 3D scene from image inputs, providing a voxel-level scene perception foundation for perception-prediction-planning autonomous driving systems. Although existing methods have made significant progress, their optimization relies solely on supervision from voxel labels and faces the challenge of voxel sparsity, since a large portion of voxels in autonomous driving scenarios are empty, which limits both optimization efficiency and model performance. To address this issue, we propose a Multi-Resolution Alignment (MRA) approach to mitigate voxel sparsity in camera-based 3D semantic scene completion, which exploits the scene- and instance-level alignment across multi-resolution 3D features as auxiliary…
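The abstract is truncated here and does not spell out how the alignment is computed, so the following is only a minimal sketch of the two ideas it names: how sparse a voxel label grid can be, and what a generic consistency term between two resolutions of a 3D feature volume might look like. The function names (`empty_ratio`, `avg_pool3d`, `scene_alignment_loss`), the choice of average pooling, and the mean-squared-error form are all assumptions made for illustration, not the paper's MRA modules.

```python
import numpy as np

def empty_ratio(labels, empty_class=0):
    """Fraction of voxels whose label equals the (assumed) empty class."""
    return float(np.mean(labels == empty_class))

def avg_pool3d(feat, factor):
    """Average-pool a (C, X, Y, Z) feature volume by an integer factor per axis."""
    c, x, y, z = feat.shape
    assert x % factor == 0 and y % factor == 0 and z % factor == 0
    blocks = feat.reshape(
        c, x // factor, factor, y // factor, factor, z // factor, factor
    )
    return blocks.mean(axis=(2, 4, 6))

def scene_alignment_loss(high_res, low_res):
    """Hypothetical alignment term: MSE between the pooled high-resolution
    volume and the low-resolution volume (not the paper's formulation)."""
    factor = high_res.shape[1] // low_res.shape[1]
    pooled = avg_pool3d(high_res, factor)
    return float(np.mean((pooled - low_res) ** 2))

# Toy label grid: 8 occupied voxels out of 8 * 8 * 4 = 256.
labels = np.zeros((8, 8, 4), dtype=np.int64)
labels[2:4, 2:4, 0:2] = 3
sparsity = empty_ratio(labels)

# Toy feature volumes at two resolutions.
rng = np.random.default_rng(0)
high = rng.normal(size=(4, 8, 8, 4))
low = avg_pool3d(high, 2)
loss_same = scene_alignment_loss(high, low)
loss_shifted = scene_alignment_loss(high, low + 1.0)
```

In this toy grid almost all voxels are empty, so a loss driven only by voxel labels receives little signal from occupied regions. A consistency term between resolutions, whatever its exact form in the paper, is defined on every feature location and so can act as additional supervision.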
Taxonomy
Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization
