SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving

Minjae Lee; Seongmin Park; Hyungmin Kim; Minyong Yoon; Janghwan Lee; Jun Won Choi; Nam Sung Kim; Mingu Kang; Jungwook Choi

arXiv:2305.07522 · cs.AR · January 17, 2024 · 2 cites

Open Access

TL;DR

SPADE is a co-designed algorithm-hardware approach that exploits vector sparsity in pillar-based 3D object detection to significantly reduce computation, improve speed, and save energy for autonomous driving perception systems.

Contribution

It introduces a novel co-design strategy with dynamic vector pruning, specialized hardware, and optimized dataflow to effectively leverage sparsity in pillar encoding for 3D detection.

Findings

1. Saves 36.3-89.2% of computation in detection networks
2. Achieves 1.3-10.9× speedup and 1.5-12.6× energy savings
3. Outperforms dense accelerators and platform baselines

Abstract

3D object detection using point cloud (PC) data is essential for perception pipelines of autonomous driving, where efficient encoding is key to meeting stringent resource and latency requirements. PointPillars, a widely adopted bird's-eye view (BEV) encoding, aggregates 3D point cloud data into 2D pillars for fast and accurate 3D object detection. However, the state-of-the-art methods employing PointPillars overlook the inherent sparsity of pillar encoding where only a valid pillar is encoded with a vector of channel elements, missing opportunities for significant computational reduction. Meanwhile, current sparse convolution accelerators are designed to handle only element-wise activation sparsity and do not effectively address the vector sparsity imposed by pillar encoding. In this paper, we propose SPADE, an algorithm-hardware co-design strategy to maximize vector sparsity in…
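The vector sparsity described in the abstract can be illustrated with a small sketch. This is not the SPADE algorithm or its hardware dataflow; it only shows, under assumed values, why skipping empty pillars reduces work. The function names (`pillar_indices`, `vector_sparsity`, `pointwise_conv_macs`), the 8×8 grid, the 1.0 cell size, and the 64-channel width are all hypothetical choices for illustration. A pointwise (1×1) convolution is used because its set of active output pillars equals its set of active input pillars, which keeps the arithmetic simple.

```python
from typing import Iterable, Set, Tuple

Point = Tuple[float, float, float]   # (x, y, z) in arbitrary units
Pillar = Tuple[int, int]             # (row, col) in the BEV grid


def pillar_indices(points: Iterable[Point],
                   grid_size: Tuple[int, int],
                   cell: float) -> Set[Pillar]:
    """Return the set of BEV grid cells (pillars) that contain at least one point."""
    rows, cols = grid_size
    active: Set[Pillar] = set()
    for x, y, _z in points:
        r, c = int(y // cell), int(x // cell)
        if 0 <= r < rows and 0 <= c < cols:   # drop points outside the grid
            active.add((r, c))
    return active


def vector_sparsity(active: Set[Pillar], grid_size: Tuple[int, int]) -> float:
    """Fraction of pillars that are empty, i.e. whose whole channel vector is zero."""
    rows, cols = grid_size
    return 1.0 - len(active) / (rows * cols)


def pointwise_conv_macs(num_pillars: int, c_in: int, c_out: int) -> int:
    """Multiply-accumulates for a 1x1 convolution over `num_pillars` positions."""
    return num_pillars * c_in * c_out


# Hypothetical toy input: four points, two of which share a pillar.
points = [(0.5, 0.5, 0.1), (0.7, 0.4, 0.3), (3.2, 1.1, 0.0), (7.9, 7.9, 1.2)]
grid = (8, 8)

active = pillar_indices(points, grid, cell=1.0)
dense_macs = pointwise_conv_macs(grid[0] * grid[1], c_in=64, c_out=64)
sparse_macs = pointwise_conv_macs(len(active), c_in=64, c_out=64)

print(f"active pillars: {len(active)} of {grid[0] * grid[1]}")
print(f"vector sparsity: {vector_sparsity(active, grid):.3f}")
print(f"MACs dense vs. active-only: {dense_macs} vs. {sparse_macs}")
```

In this toy setting the saving for a pointwise layer equals the vector sparsity itself, because either a pillar's entire channel vector is processed or none of it is. Convolutions with larger kernels can enlarge the set of active output pillars, so the saving there depends on how that growth is handled.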

Taxonomy

Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Visual Attention and Saliency Detection

Methods: Pruning · Spatially-Adaptive Normalization · Convolution · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings