SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving
Minjae Lee, Seongmin Park, Hyungmin Kim, Minyong Yoon, Janghwan Lee, Jun Won Choi, Nam Sung Kim, Mingu Kang, Jungwook Choi

TL;DR
SPADE is a co-designed algorithm-hardware approach that exploits vector sparsity in pillar-based 3D object detection to significantly reduce computation, improve speed, and save energy for autonomous driving perception systems.
Contribution
The paper introduces a co-design strategy that combines dynamic vector pruning, specialized hardware, and an optimized dataflow to exploit the vector sparsity of pillar encoding in 3D detection.
Findings
Saves 36.3-89.2% computation in detection networks
Achieves 1.3-10.9× speedup and 1.5-12.6× energy savings
Outperforms dense accelerators and platform baselines
Abstract
3D object detection using point cloud (PC) data is essential for perception pipelines of autonomous driving, where efficient encoding is key to meeting stringent resource and latency requirements. PointPillars, a widely adopted bird's-eye view (BEV) encoding, aggregates 3D point cloud data into 2D pillars for fast and accurate 3D object detection. However, the state-of-the-art methods employing PointPillars overlook the inherent sparsity of pillar encoding where only a valid pillar is encoded with a vector of channel elements, missing opportunities for significant computational reduction. Meanwhile, current sparse convolution accelerators are designed to handle only element-wise activation sparsity and do not effectively address the vector sparsity imposed by pillar encoding. In this paper, we propose SPADE, an algorithm-hardware co-design strategy to maximize vector sparsity in…
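The notion of vector sparsity described in the abstract can be illustrated with a small sketch. This is not the SPADE algorithm or its hardware dataflow; it only shows, under stated assumptions, how points fall into BEV pillars, how many pillars end up valid, and how the multiply-accumulate (MAC) count of one convolution layer would compare if only valid pillars were processed. The function names (`pillarize`, `vector_sparsity`, `dense_conv_macs`, `sparse_conv_macs_upper_bound`), the 4 × 4 grid, and the 64-channel width are hypothetical choices made for illustration.

```python
from collections import defaultdict


def pillarize(points, x_range, y_range, pillar_size):
    """Group (x, y, z) points into BEV pillars keyed by (row, col).

    Only pillars that receive at least one point appear in the result;
    these are the "valid" pillars that would carry a channel vector.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    pillars = defaultdict(list)
    for x, y, z in points:
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue  # drop points outside the BEV region
        col = int((x - x_min) // pillar_size)
        row = int((y - y_min) // pillar_size)
        pillars[(row, col)].append((x, y, z))
    return dict(pillars)


def grid_shape(x_range, y_range, pillar_size):
    """Number of (rows, cols) pillar cells covering the BEV region."""
    cols = round((x_range[1] - x_range[0]) / pillar_size)
    rows = round((y_range[1] - y_range[0]) / pillar_size)
    return rows, cols


def vector_sparsity(pillars, shape):
    """Fraction of grid cells whose whole channel vector is empty."""
    rows, cols = shape
    return 1.0 - len(pillars) / (rows * cols)


def dense_conv_macs(shape, c_in, c_out, kernel=3):
    """MACs of a dense stride-1 convolution over every grid cell."""
    rows, cols = shape
    return rows * cols * kernel * kernel * c_in * c_out


def sparse_conv_macs_upper_bound(num_valid, c_in, c_out, kernel=3):
    """Upper bound on MACs when only valid input pillars are processed.

    Each valid input pillar contributes to at most kernel*kernel outputs
    (fewer at the grid border), so this is a bound, not an exact count.
    """
    return num_valid * kernel * kernel * c_in * c_out


# Toy example: three points, two of which share a pillar.
points = [(0.1, 0.1, 0.0), (0.2, 0.15, 0.5), (3.5, 2.2, 1.0)]
x_range, y_range, pillar_size = (0.0, 4.0), (0.0, 4.0), 1.0

pillars = pillarize(points, x_range, y_range, pillar_size)
shape = grid_shape(x_range, y_range, pillar_size)

print("valid pillars:", sorted(pillars))
print("vector sparsity:", vector_sparsity(pillars, shape))
print("dense MACs:", dense_conv_macs(shape, 64, 64))
print("sparse MACs (upper bound):", sparse_conv_macs_upper_bound(len(pillars), 64, 64))
```

In this toy case 2 of 16 pillars are valid, so the bound on sparse work is one eighth of the dense count. The savings reported for SPADE depend on real point cloud statistics, the network architecture, and the pruning scheme, none of which this sketch models.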
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Visual Attention and Saliency Detection
Methods: Pruning · Spatially-Adaptive Normalization · Convolution · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
