Exploring Deep 3D Spatial Encodings for Large-Scale 3D Scene Understanding
Saqib Ali Khan, Yilei Shi, Muhammad Shahzad, Xiao Xiang Zhu

TL;DR
This paper introduces a novel graph-based encoding method for 3D point cloud segmentation that preserves spatial geometry, outperforming traditional CNN approaches in accuracy and stability on benchmark datasets.
Contribution
It proposes a new approach combining graph encodings with CNN features for improved 3D scene understanding, addressing CNN limitations in spatial information modeling.
Findings
Achieves state-of-the-art accuracy on benchmark datasets
Reduces training time and improves model stability
Effectively models 3D geometry in point clouds
Abstract
Semantic segmentation of raw 3D point clouds is an essential component of 3D scene analysis, but it poses several challenges, primarily due to the non-Euclidean nature of 3D point clouds. Although several deep learning based approaches have been proposed to address this task, almost all of them emphasize the use of latent (global) feature representations from traditional convolutional neural networks (CNNs), resulting in a severe loss of spatial information and thus failing to model the geometry of the underlying 3D objects, which plays an important role in remote sensing 3D scenes. In this letter, we propose an alternative approach that overcomes the limitations of CNN-based approaches by encoding the spatial features of raw 3D point clouds into undirected symmetrical graph models. These encodings are then combined with a high-dimensional feature vector extracted from a traditional…
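To make the idea of an undirected symmetrical graph over a point cloud concrete, here is a minimal sketch. It is not the authors' method: the excerpt does not say how the graph is built or how the encodings are combined with the CNN feature vector. The k-nearest-neighbour construction, the neighbour-mean encoding, the concatenation step, and the function names `knn_symmetric_adjacency` and `combine_with_global_feature` are all assumptions chosen for illustration.

```python
import numpy as np


def knn_symmetric_adjacency(points: np.ndarray, k: int = 3) -> np.ndarray:
    """Binary adjacency matrix of an undirected k-NN graph (assumed construction)."""
    n = points.shape[0]
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist2, np.inf)  # a point is never its own neighbour
    # Column indices of the k nearest neighbours of each point.
    nbrs = np.argsort(dist2, axis=1)[:, :k]
    adj = np.zeros((n, n), dtype=np.float32)
    rows = np.repeat(np.arange(n), k)
    adj[rows, nbrs.ravel()] = 1.0
    # Symmetrise: keep an edge if either endpoint selected the other.
    return np.maximum(adj, adj.T)


def combine_with_global_feature(
    adj: np.ndarray, points: np.ndarray, global_feat: np.ndarray
) -> np.ndarray:
    """Concatenate a simple per-point graph encoding with a global feature vector.

    The graph encoding here is the mean position of each point's neighbours;
    concatenation is an assumed way of combining it with the CNN feature.
    """
    degree = adj.sum(axis=1, keepdims=True)  # at least k for every point
    neighbour_mean = (adj @ points) / degree
    tiled = np.broadcast_to(global_feat, (points.shape[0], global_feat.shape[0]))
    return np.concatenate([points, neighbour_mean, tiled], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.normal(size=(10, 3))   # toy point cloud
    feat = rng.normal(size=8)          # stand-in for a CNN global feature
    graph = knn_symmetric_adjacency(cloud, k=3)
    fused = combine_with_global_feature(graph, cloud, feat)
    print(graph.shape, fused.shape)
```

The symmetrisation step is what makes the graph undirected: a k-NN relation is not symmetric by itself, so an edge is kept whenever either endpoint lists the other as a neighbour. The dense pairwise distance matrix is only practical for small clouds; a large-scale scene would need a spatial index instead.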
Taxonomy
Topics: 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization · Human Pose and Action Recognition
Methods: Convolution
