Selectively Dilated Convolution for Accuracy-Preserving Sparse Pillar-based Embedded 3D Object Detection

Seongmin Park; Minjae Lee; Junwon Choi; Jungwook Choi

arXiv:2408.13798·cs.CV·August 27, 2024


TL;DR

This paper introduces a selectively dilated convolution method for sparse pillar-based 3D object detection that significantly reduces computation while maintaining high accuracy, leveraging sparsity-aware acceleration techniques.

Contribution

It proposes a novel selectively dilated convolution to improve accuracy in sparse pillar networks and a cost-efficient accelerator augmentation supporting this method.

Findings

01. Achieves up to 18.1x computational savings
02. Realizes 16.2x speedup on embedded accelerators
03. Maintains detection accuracy despite high sparsity

Abstract

Pillar-based 3D object detection has gained traction in self-driving technology because of its speed and accuracy, which are facilitated by the artificial densification of pillars for GPU-friendly processing. However, dense pillar processing fundamentally wastes computation because it ignores the inherent sparsity of pillars derived from scattered point cloud data. Motivated by recent embedded accelerators with native sparsity support, sparse pillar convolution methods such as submanifold convolution (SubM-Conv) aim to reduce these redundant computations by applying convolution only on active pillars, but they suffer considerable accuracy loss. Our research identifies that this accuracy loss is due to the restricted fine-grained spatial information flow (fSIF) of SubM-Conv in sparse pillar networks. To overcome this restriction, we propose a selectively dilated convolution (SD-Conv) that evaluates the…
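The restriction the abstract attributes to SubM-Conv can be illustrated with a toy example. The sketch below is not the authors' SD-Conv and is not taken from the paper; it assumes a single-channel 3x3 kernel with all-ones weights and represents active pillars as a dictionary keyed by grid position. The function names (`conv_at`, `submanifold_conv`, `dilating_sparse_conv`) are hypothetical. It contrasts a submanifold-style convolution, which writes outputs only at already-active pillars, with a sparse convolution that also writes outputs at neighbouring empty cells and therefore dilates the active set.

```python
# Illustrative sketch only: a 3x3 single-channel convolution on a pillar grid.
# All-ones kernel weights are an assumption chosen to keep the outputs easy to check.

OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def conv_at(active, row, col):
    """Sum the features of active pillars inside the 3x3 window centred on (row, col)."""
    return sum(active.get((row + dr, col + dc), 0.0) for dr, dc in OFFSETS)


def submanifold_conv(active):
    """Compute outputs only at already-active pillars; the active set is unchanged."""
    return {pos: conv_at(active, *pos) for pos in active}


def dilating_sparse_conv(active):
    """Compute outputs wherever the window touches an active pillar; the active set grows."""
    candidates = {(r + dr, c + dc) for (r, c) in active for dr, dc in OFFSETS}
    return {pos: conv_at(active, *pos) for pos in candidates}


# Two active pillars separated by one empty cell.
pillars = {(1, 1): 1.0, (1, 3): 2.0}

subm_out = submanifold_conv(pillars)
dilated_out = dilating_sparse_conv(pillars)

# Submanifold output: each pillar sees only itself, so no information is exchanged.
print(subm_out)
# Dilated output: the empty cell (1, 2) becomes active and mixes both pillars.
print(dilated_out[(1, 2)])
```

In this toy setting the two pillars never exchange information under the submanifold variant, no matter how many such layers are stacked, because the cell between them stays inactive. The dilating variant bridges the gap at the cost of growing the active set from 2 to 15 cells, which is the computation-versus-information-flow trade-off the abstract describes.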


Taxonomy

Topics: Industrial Vision Systems and Defect Detection · Advanced Neural Network Applications · Robotics and Sensor-Based Localization

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Submanifold Convolution · Convolution