A Theory of I/O-Efficient Sparse Neural Network Inference

Niels Gleinig; Tal Ben-Nun; Torsten Hoefler

arXiv:2301.01048·cs.DC·January 4, 2023

TL;DR

This paper provides a theoretical framework for analyzing and optimizing the I/O complexity of sparse neural network inference, leading to significant speedups on real hardware.

Contribution

It establishes bounds on the I/O operations required for sparse neural network inference and introduces algorithms that approach the optimal I/O count, taking the sparsity pattern of the specific network instance into account.

Findings

01 The bounds determine the optimal number of I/Os up to a factor of 2.

02 The presented method uses a number of I/Os within that factor-of-2 range.

03 Empirical speedups of up to 45x on real hardware.

Abstract

As the accuracy of machine learning models increases at a fast rate, so does their demand for energy and compute resources. On a low level, the major part of these resources is consumed by data movement between different memory units. Modern hardware architectures contain a form of fast memory (e.g., cache, registers), which is small, and a slow memory (e.g., DRAM), which is larger but expensive to access. We can only process data that is stored in fast memory, which incurs data movement (input/output-operations, or I/Os) between the two units. In this paper, we provide a rigorous theoretical analysis of the I/Os needed in sparse feedforward neural network (FFNN) inference. We establish bounds that determine the optimal number of I/Os up to a factor of 2 and present a method that uses a number of I/Os within that range. Much of the I/O-complexity is determined by a few high-level…
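To make the fast/slow memory picture in the abstract concrete, the following is a minimal sketch that counts loads from slow memory while the connections of one layer are processed in a given order. It is a toy model, not the paper's analysis: the function name `count_loads`, the LRU eviction policy, the decision to count only loads of neuron values (ignoring weights and write-backs), and the example layer are all assumptions made for illustration.

```python
from collections import OrderedDict


def count_loads(edges, fast_memory_size):
    """Count loads from slow memory when processing edges in the given order.

    Assumptions (illustrative only): each edge (u, v) needs the values of both
    neurons in fast memory, fast memory holds `fast_memory_size` values, and
    the least recently used value is evicted when it is full. Weights and
    write-backs are not counted.
    """
    if fast_memory_size < 2:
        raise ValueError("need room for both endpoints of an edge")
    cache = OrderedDict()  # keys ordered from least to most recently used
    loads = 0
    for u, v in edges:
        for neuron in (u, v):
            if neuron in cache:
                cache.move_to_end(neuron)  # hit: mark as most recently used
            else:
                loads += 1  # miss: one load from slow memory
                cache[neuron] = True
                if len(cache) > fast_memory_size:
                    cache.popitem(last=False)  # evict least recently used
    return loads


# Hypothetical layer: 4 input neurons, each connected to outputs "x" and "y".
inputs = [0, 1, 2, 3]
outputs = ["x", "y"]

# Same set of connections, two different processing orders.
output_major = [(i, o) for o in outputs for i in inputs]
input_major = [(i, o) for i in inputs for o in outputs]

loads_output_major = count_loads(output_major, fast_memory_size=4)
loads_input_major = count_loads(input_major, fast_memory_size=4)
print(loads_output_major, loads_input_major)
```

With a fast memory of four values, the input-major order keeps both output neurons resident and loads each of the six neurons exactly once, while the output-major order has to reload every input for the second output. The computation is identical in both cases; only the order of the connections changes the amount of data movement.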

Taxonomy

Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Neural Networks and Applications