Context and Geometry Aware Voxel Transformer for Semantic Scene Completion
Zhu Yu, Runmin Zhang, Jiacheng Ying, Junchen Yu, Xiaohai Hu, Lun Luo, Si-Yuan Cao, Hui-Liang Shen

TL;DR
This paper introduces CGFormer, a novel voxel transformer that incorporates context and geometry awareness for improved semantic scene completion, achieving state-of-the-art results on benchmark datasets.
Contribution
It proposes a context and geometry aware voxel transformer with a context-aware query generator and 3D deformable cross-attention, enhancing semantic and geometric understanding in SSC.
Findings
Achieves state-of-the-art mIoU of 16.87 on SemanticKITTI.
Attains 20.05 mIoU on SSCBench-KITTI-360.
Outperforms methods using temporal images or larger backbones.
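For readers unfamiliar with the metric, the following is a minimal sketch of how a mean intersection-over-union (mIoU) score can be computed from per-voxel class labels. It is a generic illustration, not the benchmarks' official evaluation code: the function name `mean_iou`, the `ignore_label` parameter, and the choice to average only over classes that actually occur are all assumptions made for this example.

```python
def mean_iou(pred, target, num_classes, ignore_label=None):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur.

    pred and target are flat sequences of integer class labels, one per voxel.
    Voxels whose target equals ignore_label are skipped (hypothetical option).
    """
    inter = [0] * num_classes  # true positives per class
    union = [0] * num_classes  # TP + FP + FN per class
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six voxels (made-up labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(round(score, 4))  # per-class IoUs are 1/3, 2/3 and 1/2, so this prints 0.5
```

Benchmark scores such as those above are conventionally reported as this fraction multiplied by 100; which classes and voxels each benchmark includes in the average is defined by its own evaluation protocol.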
Abstract
Vision-based Semantic Scene Completion (SSC) has gained much attention due to its widespread applications in various 3D perception tasks. Existing sparse-to-dense approaches typically employ shared context-independent queries across various input images, which fails to capture distinctions among them as the focal regions of different inputs vary and may result in undirected feature aggregation of cross-attention. Additionally, the absence of depth information may lead to points projected onto the image plane sharing the same 2D position or similar sampling points in the feature map, resulting in depth ambiguity. In this paper, we present a novel context and geometry aware voxel transformer. It utilizes a context aware query generator to initialize context-dependent queries tailored to individual input images, effectively capturing their unique characteristics and aggregating information…
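The depth ambiguity described in the abstract can be illustrated with a standard pinhole camera model: any two 3D points on the same camera ray project to the same pixel, so image features sampled at that pixel cannot tell them apart without depth cues. The sketch below is only an illustration of that geometric fact, not the paper's method; the intrinsics `fx`, `fy`, `cx`, `cy` and the point coordinates are made-up values.

```python
def project(point, fx, fy, cx, cy):
    """Pinhole projection of a 3D camera-frame point (x, y, z) to pixel (u, v)."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical camera intrinsics (focal lengths and principal point, in pixels).
fx = fy = 500.0
cx, cy = 320.0, 240.0

# Two points on the same camera ray: the second is the first scaled by 2,
# so it lies twice as far from the camera.
near = (1.0, 0.5, 5.0)
far = tuple(2.0 * c for c in near)  # (2.0, 1.0, 10.0)

uv_near = project(near, fx, fy, cx, cy)
uv_far = project(far, fx, fy, cx, cy)
print(uv_near, uv_far)  # both are (420.0, 290.0)
```

Because both voxel centers land on the same pixel, a query that samples 2D features at the projected location alone would receive identical features for the two depths, which is the ambiguity the abstract refers to.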
Taxonomy
Topics: Music and Audio Processing · Video Analysis and Summarization · Time Series Analysis and Forecasting
Methods: Attentive Walk-Aggregating Graph Neural Network
