Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch

Aojun Zhou; Yukun Ma; Junnan Zhu; Jianbo Liu; Zhijie Zhang; Kun Yuan; Wenxiu Sun; Hongsheng Li

arXiv:2102.04010 · cs.CV · April 20, 2021 · 74 cites

TL;DR

This paper introduces a method for training N:M fine-grained structured sparse neural networks from scratch. The approach combines the benefits of unstructured and structured sparsity and reaches a 2x speed-up on Nvidia A100 GPUs with no performance drop.

Contribution

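The paper is the first to study training N:M structured sparse networks from scratch. It proposes SR-STE (Sparse-Refined Straight-Through Estimator) to improve the gradient approximation, and it introduces SAD (Sparse Architecture Divergence) to analyze topology changes during training.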

Findings

01 2x speed-up on Nvidia A100 GPUs with no performance drop

02 SR-STE outperforms vanilla STE in training sparse networks

03 SAD effectively measures topology changes during training

Abstract

Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate the models on resource-constrained environments. It can be generally categorized into unstructured fine-grained sparsity that zeroes out multiple individual weights distributed across the neural network, and structured coarse-grained sparsity which prunes blocks of sub-networks of a neural network. Fine-grained sparsity can achieve a high compression ratio but is not hardware friendly and hence receives limited speed gains. On the other hand, coarse-grained sparsity cannot concurrently achieve both apparent acceleration on modern GPUs and decent performance. In this paper, we are the first to study training from scratch an N:M fine-grained structured sparse network, which can maintain the advantages of both unstructured fine-grained sparsity and structured coarse-grained sparsity simultaneously on…
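To make the N:M pattern concrete: under the usual definition, every group of M consecutive weights keeps at most N nonzero values (2:4 is the pattern accelerated on the A100). The sketch below is not the paper's training procedure. It only illustrates the sparsity constraint, assuming magnitude-based selection within each group; the function name `nm_prune` and the sample weights are hypothetical.

```python
def nm_prune(weights, n=2, m=4):
    """Keep the n largest-magnitude entries in each group of m consecutive
    weights and zero the rest (illustrative N:M pattern, not the paper's code)."""
    if len(weights) % m != 0:
        raise ValueError("number of weights must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Positions of the n largest magnitudes; ties go to the lower index
        # because sorted() is stable.
        keep = set(sorted(range(m), key=lambda i: -abs(group[i]))[:n])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


# Hypothetical weight row with two groups of four.
row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
sparse_row = nm_prune(row, n=2, m=4)
print(sparse_row)  # [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.4, 0.0]
```

Because the nonzero positions are constrained within small fixed-size groups, the pattern stays fine-grained like unstructured sparsity while remaining regular enough for hardware to exploit.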

Code & Models

Videos

Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch · SlidesLive

Taxonomy

Topics

Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM