Truly Sparse Neural Networks at Scale

Selima Curci; Decebal Constantin Mocanu; Mykola Pechenizkiy

arXiv:2102.01732 · cs.LG · July 13, 2022

TL;DR

This paper shows that truly sparse neural networks, implemented without binary masks over dense matrices, can be trained at scale, and it reports state-of-the-art performance together with more efficient training and inference.

Contribution

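The paper introduces three components designed for sparse neural networks: a parallel training algorithm with a sparse implementation written from scratch, an activation function with non-trainable parameters that favours gradient flow, and a hidden-neuron importance metric for eliminating redundancies.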

Findings

01. Trained what the authors describe as the largest neural network to date in terms of representational power.

02. Reached state-of-the-art performance with truly sparse networks.

03. Improved training and inference efficiency, supporting more environmentally friendly AI.

Abstract

Recently, sparse training methods have started to be established as a de facto approach for training and inference efficiency in artificial neural networks. Yet, this efficiency is just in theory. In practice, everyone uses a binary mask to simulate sparsity since the typical deep learning software and hardware are optimized for dense matrix operations. In this paper, we take an orthogonal approach, and we show that we can train truly sparse neural networks to harvest their full potential. To achieve this goal, we introduce three novel contributions, specially designed for sparse neural networks: (1) a parallel training algorithm and its corresponding sparse implementation from scratch, (2) an activation function with non-trainable parameters to favour the gradient flow, and (3) a hidden neurons importance metric to eliminate redundancies. All in one, we are able to break the record and…
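
The abstract contrasts sparsity simulated with a binary mask against a truly sparse representation. The following is a minimal sketch of that distinction in plain Python, not the authors' implementation: all function names (`masked_dense_matvec`, `to_triples`, `sparse_matvec`, `demo`) and the layer sizes and density are hypothetical choices made for illustration. The masked version stores and multiplies every weight, while the sparse version stores and touches only the active connections.

```python
import random


def masked_dense_matvec(weights, mask, x):
    """Simulated sparsity: all entries are stored and multiplied;
    masked-out weights simply contribute zero to the sum."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]


def to_triples(weights, mask):
    """True sparsity: keep only active connections as (row, col, value)."""
    return [
        (i, j, weights[i][j])
        for i, m_row in enumerate(mask)
        for j, m in enumerate(m_row)
        if m
    ]


def sparse_matvec(triples, n_rows, x):
    """Matrix-vector product that visits only the stored connections."""
    out = [0.0] * n_rows
    for i, j, w in triples:
        out[i] += w * x[j]
    return out


def demo(n_rows=50, n_cols=40, density=0.05, seed=0):
    """Build one random layer (illustrative sizes) and run both variants."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]
    mask = [
        [1 if rng.random() < density else 0 for _ in range(n_cols)]
        for _ in range(n_rows)
    ]
    x = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    triples = to_triples(weights, mask)
    dense_out = masked_dense_matvec(weights, mask, x)
    sparse_out = sparse_matvec(triples, n_rows, x)
    # Return both outputs plus the number of stored weights in each scheme.
    return dense_out, sparse_out, n_rows * n_cols, len(triples)


if __name__ == "__main__":
    dense_out, sparse_out, dense_count, sparse_count = demo()
    print("stored weights (masked dense):", dense_count)
    print("stored weights (truly sparse):", sparse_count)
```

Both variants produce the same layer output; the difference is that the masked form still pays memory and compute for every entry of the dense matrix, which is the gap between theoretical and practical efficiency that the abstract points to.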

Code & Models

Repositories

SelimaC/large-scale-sparse-neural-networks
Official

Taxonomy

Topics: Neural Networks and Applications · Advanced Neural Network Applications · Machine Learning and ELM