Optimized CNNs for Rapid 3D Point Cloud Object Recognition

Tianyi Lyu; Dian Gu; Peiyuan Chen; Yaoting Jiang; Zhenhong Zhang; Huadong Pang; Li Zhou; Yiping Dong

arXiv:2412.02855 · cs.CV · December 5, 2024 · 2 cites

Open Access

TL;DR

This paper presents a sparse CNN architecture with $L_{1}$ regularization for efficient 3D object detection in point clouds, reporting higher accuracy than prior methods at competitive speed on the MVTec 3D-AD benchmark.

Contribution

It introduces a sparse convolutional layer design with $L_{1}$ regularization, improving 3D point cloud object recognition efficiency and accuracy over prior methods.
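
As a minimal sketch of what an $L_{1}$ penalty on activations could look like, the snippet below adds the summed absolute values of intermediate activations to a task loss. The function names, the `strength` coefficient, and the toy arrays are illustrative assumptions, not the authors' code or settings.

```python
import numpy as np

def l1_activation_penalty(activations, strength=1e-3):
    # Sum of absolute activation values over all layers, scaled by
    # `strength` (a hypothetical coefficient; the source gives no value).
    return strength * sum(np.abs(a).sum() for a in activations)

def regularized_loss(task_loss, activations, strength=1e-3):
    # Task loss plus the sparsity-encouraging L1 term.
    return task_loss + l1_activation_penalty(activations, strength)

# Toy intermediate activations from two hypothetical layers.
acts = [np.array([1.0, -2.0, 0.0]), np.array([[0.5, -0.5]])]
loss = regularized_loss(task_loss=2.0, activations=acts, strength=0.1)
```

Because the penalty grows with every non-zero activation, minimizing it pushes intermediate feature maps toward zero, which is the sparsity that a sparse convolution can then exploit.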

Findings

01

Outperforms previous state-of-the-art in 3D object detection

02

Maintains competitive processing speeds for real-time use

03

Demonstrates effectiveness on the MVTec 3D-AD benchmark

Abstract

This study introduces a method for efficiently detecting objects within 3D point clouds using convolutional neural networks (CNNs). Our approach adopts a unique feature-centric voting mechanism to construct convolutional layers that capitalize on the typical sparsity observed in input data. We explore the trade-off between accuracy and speed across diverse network architectures and advocate for integrating an $L_{1}$ penalty on filter activations to augment sparsity within intermediate layers. This research pioneers the proposal of sparse convolutional layers combined with $L_{1}$ regularization to effectively handle large-scale 3D data processing. Our method's efficacy is demonstrated on the MVTec 3D-AD object detection benchmark. The Vote3Deep models, with just three layers, outperform the previous state-of-the-art in both laser-only approaches and combined…
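
The feature-centric voting idea can be sketched as follows: instead of sliding a filter over every cell of the grid, each occupied cell casts weighted votes into its neighbouring output cells, so empty cells cost nothing. This is a simplified single-filter illustration under assumed names (`voting_conv3d`, `dense_conv3d`) and an assumed dictionary layout for the sparse input; it is not the authors' implementation.

```python
import numpy as np

def voting_conv3d(features, weights, grid_shape):
    # features: {(x, y, z): vector of shape (C,)} for occupied cells only.
    # weights:  array of shape (k, k, k, C) with odd k, one output channel.
    k = weights.shape[0]
    r = k // 2
    out = np.zeros(grid_shape)
    for (x, y, z), f in features.items():
        for dx in range(k):
            for dy in range(k):
                for dz in range(k):
                    # Output cell that sees this input at kernel offset d.
                    tx, ty, tz = x - dx + r, y - dy + r, z - dz + r
                    if (0 <= tx < grid_shape[0] and 0 <= ty < grid_shape[1]
                            and 0 <= tz < grid_shape[2]):
                        out[tx, ty, tz] += f @ weights[dx, dy, dz]
    return out

def dense_conv3d(grid, weights):
    # Reference: zero-padded dense cross-correlation over the full grid.
    k = weights.shape[0]
    r = k // 2
    X, Y, Z = grid.shape[:3]
    padded = np.pad(grid, ((r, r), (r, r), (r, r), (0, 0)))
    out = np.zeros((X, Y, Z))
    for dx in range(k):
        for dy in range(k):
            for dz in range(k):
                out += padded[dx:dx + X, dy:dy + Y, dz:dz + Z] @ weights[dx, dy, dz]
    return out

# Toy example: three occupied cells in a 6 x 5 x 4 grid with 3 channels.
rng = np.random.default_rng(0)
grid = np.zeros((6, 5, 4, 3))
occupied = [(0, 0, 0), (2, 3, 1), (5, 4, 3)]
for cell in occupied:
    grid[cell] = rng.normal(size=3)
weights = rng.normal(size=(3, 3, 3, 3))
sparse_input = {cell: grid[cell] for cell in occupied}
same = np.allclose(voting_conv3d(sparse_input, weights, grid.shape[:3]),
                   dense_conv3d(grid, weights))
```

In this sketch the voting loop visits only the occupied cells, so its cost scales with the number of non-empty cells times the kernel volume rather than with the full grid size, while producing the same output as the dense reference.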

Taxonomy

Topics: Advanced Neural Network Applications · 3D Surveying and Cultural Heritage · Remote Sensing and LiDAR Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Feature-Centric Voting