Sensor Fusion by Spatial Encoding for Autonomous Driving

Quoc-Vinh Lai-Dang; Jihui Lee; Bumgeun Park; Dongsoo Har

arXiv:2308.10707·cs.CV·August 22, 2023

TL;DR

This paper presents a novel sensor fusion method using spatial encoding with Transformers for autonomous driving, effectively combining camera and LiDAR data to improve perception accuracy in complex environments.

Contribution

Introduces a Transformer-based sensor fusion approach with multi-resolution modules for enhanced local and global contextual integration in autonomous driving.

Findings

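01. Outperforms previous approaches on the Longest6 and Town05 Long benchmarks, with higher driving and infraction scores

02. Achieves 8% and 19% higher driving scores than TransFuser on Longest6 and Town05 Long, respectively

03. Validated on two adversarial benchmarks with lengthy routes and high-density traffic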

Abstract

Sensor fusion is critical to perception systems for task domains such as autonomous driving and robotics. Recently, the Transformer integrated with a CNN has demonstrated high performance in sensor fusion for various perception tasks. In this work, we introduce a method for fusing data from camera and LiDAR. By employing Transformer modules at multiple resolutions, the proposed method effectively combines local and global contextual relationships. The performance of the proposed method is validated by extensive experiments on two adversarial benchmarks with lengthy routes and high-density traffic. The proposed method outperforms previous approaches on the most challenging benchmarks, achieving significantly higher driving and infraction scores. Compared with TransFuser, it achieves 8% and 19% improvement in driving scores on the Longest6 and Town05 Long benchmarks, respectively.
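The idea of applying Transformer modules at multiple resolutions can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the function names (`fuse_at_resolution`, `multi_resolution_fusion`, `avg_pool2`), the single-head attention, the random projection weights standing in for learned ones, and the average pooling standing in for CNN stages are all assumptions made for illustration. It only shows the general pattern of flattening camera and LiDAR feature maps into tokens, letting them attend to each other, and repeating this at coarser scales.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    # Single-head scaled dot-product attention over all tokens.
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def fuse_at_resolution(cam_feat, lidar_feat, rng):
    # cam_feat, lidar_feat: (H, W, C) feature maps at the same resolution.
    h, w, c = cam_feat.shape
    tokens = np.concatenate(
        [cam_feat.reshape(h * w, c), lidar_feat.reshape(h * w, c)], axis=0
    )  # (2*H*W, C): every token can attend to both modalities
    # Hypothetical random weights; a real model would learn these.
    w_q, w_k, w_v = (rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(3))
    fused = tokens + self_attention(tokens, w_q, w_k, w_v)  # residual connection
    return fused[: h * w].reshape(h, w, c), fused[h * w :].reshape(h, w, c)

def avg_pool2(x):
    # 2x2 average pooling, a stand-in for a downsampling CNN stage.
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def multi_resolution_fusion(cam, lidar, num_levels=3, seed=0):
    # Fuse at each level, then downsample both branches for the next level.
    rng = np.random.default_rng(seed)
    outputs = []
    for _ in range(num_levels):
        cam, lidar = fuse_at_resolution(cam, lidar, rng)
        outputs.append((cam, lidar))
        cam, lidar = avg_pool2(cam), avg_pool2(lidar)
    return outputs

if __name__ == "__main__":
    levels = multi_resolution_fusion(np.ones((8, 8, 4)), np.zeros((8, 8, 4)))
    for cam_out, lidar_out in levels:
        print(cam_out.shape, lidar_out.shape)
```

In this sketch, attention at the finest level mixes information across all spatial positions of both sensors, while the coarser levels do the same over fewer, more aggregated tokens. That is one simple way to read "local and global contextual relationships", under the assumptions stated above.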

Taxonomy

Topics: Advanced Optical Sensing Technologies · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Label Smoothing · Layer Normalization · Softmax · Dense Connections