Inference, Learning and Attention Mechanisms that Exploit and Preserve Sparsity in Convolutional Networks

Timo Hackel; Mikhail Usvyatsov; Silvano Galliani; Jan D. Wegner; Konrad Schindler

arXiv:1801.10585·cs.CV·March 13, 2020

TL;DR

This paper introduces tools for processing sparse data efficiently in convolutional neural networks. They exploit sparsity in both the feature maps and the filter weights, which lowers memory footprint and computation time, and an adapted back-propagation algorithm keeps the resulting sparse networks trainable.

Contribution

The authors present a suite of tools including a sparse convolution implementation, an attention mechanism to prevent fill-in, and an adapted back-propagation algorithm for sparse CNNs.

Findings

01

Significantly lower memory footprint and computation time than the conventional dense framework on highly sparse data.

02

The attention step prevents fill-in during convolution, so feature maps stay sparse.

03

Compatibility with standard learning frameworks for training sparse CNNs.
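
Fill-in, the effect that finding 02 refers to, can be seen on a toy example. The sketch below is an illustration only and is not taken from the paper or its repository: it applies an ordinary dense 1D convolution three times to a signal with a single non-zero entry and records the fraction of non-zero values after each layer. The function names `dense_conv1d` and `density`, the signal length, and the kernel are all assumptions chosen for the demonstration.

```python
def dense_conv1d(signal, kernel):
    """Zero-padded, same-size 1D correlation on a plain Python list."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius  # input index seen by kernel tap k
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def density(signal):
    """Fraction of entries that are non-zero."""
    return sum(1 for v in signal if v != 0.0) / len(signal)


# Hypothetical input: 32 samples with exactly one non-zero value.
signal = [0.0] * 32
signal[16] = 1.0
kernel = [0.25, 0.5, 0.25]  # all-positive taps, so no values cancel

densities = [density(signal)]
for _ in range(3):  # three stacked convolution layers
    signal = dense_conv1d(signal, kernel)
    densities.append(density(signal))

print(densities)
```

With a kernel of width 3, each layer widens the non-zero region by one sample on either side, so the density grows with depth even though the input was almost empty. In 2D or 3D the same growth happens along every axis at once.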

Abstract

While CNNs naturally lend themselves to densely sampled data, and sophisticated implementations are available, they lack the ability to efficiently process sparse data. In this work we introduce a suite of tools that exploit sparsity in both the feature maps and the filter weights, and thereby allow for significantly lower memory footprints and computation times than the conventional dense framework when processing data with a high degree of sparsity. Our scheme provides (i) an efficient GPU implementation of a convolution layer based on direct, sparse convolution; (ii) a filter step within the convolution layer, which we call attention, that prevents fill-in, i.e., the tendency of convolution to rapidly decrease sparsity, and guarantees an upper bound on the computational resources; and (iii) an adaptation of the back-propagation algorithm, which makes it possible to combine our…
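
Items (i) and (ii) of the abstract can be sketched in a few lines of plain Python. This is a minimal CPU illustration under stated assumptions, not the authors' GPU implementation: feature maps and filters are stored as dictionaries of non-zero entries, the convolution visits only pairs of non-zero inputs and non-zero weights, and the filter step keeps the `k` responses of largest magnitude. That selection rule is an assumption made for this example, since the abstract says only that the step prevents fill-in and bounds the resources. The names `sparse_conv2d` and `keep_top_k` are hypothetical.

```python
def sparse_conv2d(features, weights, shape):
    """Direct sparse 2D correlation.

    features: {(row, col): value} holding only non-zero inputs
    weights:  {(d_row, d_col): weight} holding only non-zero filter taps,
              with offsets measured from the filter centre
    shape:    (height, width) of the feature map
    """
    height, width = shape
    out = {}
    for (r, c), value in features.items():
        for (dr, dc), w in weights.items():
            # out[p] = sum_d w[d] * in[p + d], so input q feeds p = q - d
            rr, cc = r - dr, c - dc
            if 0 <= rr < height and 0 <= cc < width:
                out[(rr, cc)] = out.get((rr, cc), 0.0) + w * value
    # Drop exact zeros produced by cancellation to keep storage sparse.
    return {pos: v for pos, v in out.items() if v != 0.0}


def keep_top_k(features, k):
    """Assumed filter rule: keep the k entries of largest magnitude.

    Ties are broken by position so the result is deterministic.
    """
    ranked = sorted(features.items(), key=lambda item: (-abs(item[1]), item[0]))
    return dict(ranked[:k])


# Hypothetical 8x8 feature map with two non-zero entries.
inputs = {(2, 2): 1.0, (5, 6): -2.0}
# Cross-shaped 3x3 filter: the four corner taps are zero and never stored.
cross = {(0, 0): 1.0, (-1, 0): 0.5, (1, 0): 0.5, (0, -1): 0.5, (0, 1): 0.5}

response = sparse_conv2d(inputs, cross, shape=(8, 8))
kept = keep_top_k(response, k=4)

print(len(inputs), len(response), len(kept))
```

The work done by `sparse_conv2d` scales with the number of non-zero inputs times the number of non-zero weights rather than with the size of the feature map. Because `keep_top_k` caps the number of stored entries at `k`, the input to the next layer has a known upper bound, which is one simple way to obtain the kind of resource guarantee the abstract describes.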

Code & Models

Repositories

TimoHackel/ILA-SCNN
tf · Official

Taxonomy

Methods: Convolution