A Point-Based Approach to Efficient LiDAR Multi-Task Perception

Christopher Lang; Alexander Braun; Lars Schillingmann; Abhinav Valada

arXiv:2404.12798 · cs.CV · April 22, 2024

TL;DR

PAttFormer is a novel point-based multi-task architecture for LiDAR perception that is smaller and faster than previous multi-task methods while maintaining competitive accuracy for semantic segmentation and object detection.

Contribution

The paper introduces a transformer-based, point-only architecture that eliminates the need for multiple task-specific encoders, improving efficiency in LiDAR multi-task perception.

Findings

1. 3x smaller network size compared to previous methods
2. 1.4x faster inference speed
3. Improved segmentation and detection accuracy on nuScenes

Abstract

Multi-task networks can potentially improve performance and computational efficiency compared to single-task networks, facilitating online deployment. However, current multi-task architectures in point cloud perception combine multiple task-specific point cloud representations, each requiring a separate feature encoder and making the network structures bulky and slow. We propose PAttFormer, an efficient multi-task architecture for joint semantic segmentation and object detection in point clouds that only relies on a point-based representation. The network builds on transformer-based feature encoders using neighborhood attention and grid-pooling and a query-based detection decoder using a novel 3D deformable-attention detection head design. Unlike other LiDAR-based multi-task architectures, our proposed PAttFormer does not require separate feature encoders for multiple task-specific…
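The abstract names grid-pooling as one building block of the point-based encoder. As a rough illustration of the general idea only, the sketch below downsamples a point cloud by grouping points into regular 3D grid cells and averaging their coordinates and features per cell. It is not the authors' implementation: the function name `grid_pool`, the `cell_size` parameter, and the choice of mean pooling are all assumptions made for this example.

```python
import numpy as np


def grid_pool(points, feats, cell_size):
    """Mean-pool points and features over a regular 3D grid.

    Hypothetical helper for illustration; the paper's pooling operator
    may differ (for example, it could use max pooling for features).

    points: (N, 3) float array of xyz coordinates
    feats:  (N, C) float array of per-point features
    Returns pooled coordinates (M, 3), pooled features (M, C), and the
    (N,) index mapping each input point to its pooled cell.
    """
    # Integer grid-cell index of every point.
    cells = np.floor(points / cell_size).astype(np.int64)
    # Assign a compact id to each occupied cell.
    _, inverse = np.unique(cells, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # flatten for consistency across NumPy versions
    n_cells = int(inverse.max()) + 1

    # Number of points that fall into each cell.
    counts = np.bincount(inverse, minlength=n_cells).astype(np.float64)

    # Accumulate per-cell sums, then divide by the counts to get means.
    pooled_xyz = np.zeros((n_cells, 3), dtype=np.float64)
    pooled_feat = np.zeros((n_cells, feats.shape[1]), dtype=np.float64)
    np.add.at(pooled_xyz, inverse, points)
    np.add.at(pooled_feat, inverse, feats)
    return pooled_xyz / counts[:, None], pooled_feat / counts[:, None], inverse


if __name__ == "__main__":
    pts = np.array([[0.1, 0.1, 0.1], [0.3, 0.2, 0.1], [1.5, 0.1, 0.1]])
    f = np.array([[1.0], [3.0], [10.0]])
    xyz, pooled, idx = grid_pool(pts, f, cell_size=1.0)
    print(xyz.shape, pooled.ravel(), idx)
```

In this toy input the first two points share a grid cell, so three points are reduced to two pooled points. The returned index map is what a decoder stage would typically need to propagate coarse features back to the original points for per-point segmentation.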

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Neural Network Applications · Advanced Vision and Imaging

Methods: Neighborhood Attention