Normal Transformer: Extracting Surface Geometry from LiDAR Points Enhanced by Visual Semantics

Ancheng Lin; Jun Li; Yusheng Xiang; Wei Bian; Mukesh Prasad

arXiv:2211.10580 · cs.CV · February 13, 2025 · 1 cites

TL;DR

This paper introduces a transformer-based multi-modal neural network that fuses LiDAR and camera data to accurately estimate surface normals in autonomous driving scenarios, improving upon existing methods especially in sparse, noisy conditions.

Contribution

The paper proposes the Hybrid Geometric Transformer (HGT), a novel architecture that effectively combines visual and geometric data for surface normal estimation in challenging real-world LiDAR scans.

Findings

01

HGT outperforms existing normal estimation methods.

02

The model successfully transfers learned knowledge from simulated to real-world data.

03

Enhanced normal estimation improves downstream autonomous driving tasks.

Abstract

High-quality surface normals can help improve geometry estimation in problems faced by autonomous vehicles, such as collision avoidance and occlusion inference. While a considerable volume of literature focuses on densely scanned indoor scenarios, normal estimation during autonomous driving remains an intricate problem due to the sparse, non-uniform, and noisy nature of real-world LiDAR scans. In this paper, we introduce a multi-modal technique that leverages 3D point clouds and 2D colour images obtained from LiDAR and camera sensors for surface normal estimation. We present the Hybrid Geometric Transformer (HGT), a novel transformer-based neural network architecture that proficiently fuses visual semantic and 3D geometric information. Furthermore, we develop an effective learning strategy for the multi-modal data. Experimental results demonstrate the superior effectiveness of our…
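
For readers new to the task, the sketch below shows what "surface normal estimation" means for a single point. It is a classical PCA plane fit over neighbouring points, not the HGT method described in the abstract. The function name `estimate_normal`, the start vector, and the iteration count are illustrative assumptions. The fit takes the normal to be the direction of least variance in the local patch, which is also why sparse or noisy LiDAR neighbourhoods make purely geometric estimates unreliable.

```python
import math

def estimate_normal(neighbors, iterations=100):
    """Estimate a unit normal for a local patch of 3D points (PCA plane fit).

    Hypothetical helper for illustration only; this is a classical baseline,
    not the method proposed in the paper.
    """
    n = len(neighbors)
    if n < 3:
        raise ValueError("need at least 3 neighbouring points")
    # Centroid of the patch.
    c = [sum(p[i] for p in neighbors) / n for i in range(3)]
    # 3x3 covariance matrix of the centred points.
    cov = [[sum((p[i] - c[i]) * (p[j] - c[j]) for p in neighbors) / n
            for j in range(3)] for i in range(3)]
    # The normal is the eigenvector of cov with the smallest eigenvalue.
    # It is also the dominant eigenvector of (trace * I - cov), so plain
    # power iteration on that shifted matrix can find it.
    trace = cov[0][0] + cov[1][1] + cov[2][2]
    m = [[(trace if i == j else 0.0) - cov[i][j] for j in range(3)]
         for i in range(3)]
    v = [0.3, 0.5, 0.8]  # arbitrary start vector (an assumption)
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("degenerate patch: all points coincide")
        v = [x / norm for x in w]
    return v

# Usage: a flat 3x3 patch lying in the z = 0 plane.
flat_patch = [(float(x), float(y), 0.0) for x in range(3) for y in range(3)]
normal = estimate_normal(flat_patch)
```

The sign of the returned vector is arbitrary; in practice it is usually flipped to face the sensor. Power iteration converges slowly when the patch is nearly a line, and it can fail if the start vector happens to be orthogonal to the true normal, so a library eigen-solver is the more robust choice outside a sketch like this.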

Taxonomy

Topics: 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques · Advanced Vision and Imaging