Putting 3D Spatially Sparse Networks on a Diet

Junha Lee; Christopher Choy; Jaesik Park

arXiv:2112.01316 · cs.CV · April 11, 2022

TL;DR

This paper investigates weight sparsity in 3D neural networks and introduces a compact, sparse 3D convolutional network that maintains performance while significantly reducing size and computational cost.

Contribution

The paper presents the first comprehensive study of weight sparsity in spatially sparse 3D convolutional networks and proposes WS^3-Convnet, a compact network that is both weight-sparse and spatially sparse.

Findings

01. Achieves 99% parameter reduction with only a 2.15% performance drop

02. Reduces computational cost by 95%

03. Speeds up inference by 45%

Abstract

3D neural networks have become prevalent for many 3D vision tasks including object detection, segmentation, registration, and various perception tasks for 3D inputs. However, due to the sparsity and irregularity of 3D data, custom 3D operators or network designs have been the primary focus of research, while the size of networks or efficacy of parameters has been overlooked. In this work, we perform the first comprehensive study on the weight sparsity of spatially sparse 3D convolutional networks and propose a compact weight-sparse and spatially sparse 3D convnet (WS^3-Convnet) for semantic and instance segmentation on the real-world indoor and outdoor datasets. We employ various network pruning strategies to find compact networks and show our WS^3-Convnet achieves minimal loss in performance (2.15% drop) with orders-of-magnitude smaller number of parameters (99% compression rate) and…
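To make the idea of a 99% compression rate concrete, here is a minimal sketch of one common pruning strategy, unstructured magnitude pruning. The abstract does not say which strategies the authors use, so the choice of magnitude pruning is an assumption, as are the function names `magnitude_prune` and `compression_rate` and the layer shape, all of which are hypothetical and chosen only for illustration.

```python
import random


def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed.

    Illustrative only: this is generic magnitude pruning, not necessarily
    the procedure used in the paper.
    """
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def compression_rate(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


random.seed(0)
# Hypothetical layer: a 3x3x3 kernel with 32 input and 64 output channels.
n_weights = 3 * 3 * 3 * 32 * 64
dense = [random.gauss(0.0, 1.0) for _ in range(n_weights)]

sparse = magnitude_prune(dense, sparsity=0.99)
remaining = sum(1 for w in sparse if w != 0.0)
print(f"kept {remaining} of {n_weights} weights "
      f"(compression rate {compression_rate(sparse):.2%})")
```

Note that zeroing weights only shrinks the stored model and the arithmetic cost if the sparse weights are stored and executed in a sparse format; the sketch above shows the selection step only.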

Taxonomy

Topics: 3D Shape Modeling and Analysis · Human Pose and Action Recognition · Advanced Neural Network Applications

Methods: Pruning · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings