SWIS -- Shared Weight bIt Sparsity for Efficient Neural Network Acceleration

Shurui Li; Wojciech Romaszkan; Alexander Graening; Puneet Gupta

arXiv:2103.01308·cs.LG·March 4, 2021

TL;DR

SWIS is a quantization framework built on shared weight bit sparsity. It improves inference accuracy and storage compression over weight truncation, and its accelerator delivers up to 6x speedup and 1.9x energy improvement over state-of-the-art bit-serial architectures.

Contribution

The paper proposes SWIS, a novel quantization method utilizing shared weight bit sparsity, with an offline decomposition and scheduling algorithm for improved neural network acceleration.

Findings

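01. Achieves up to a 54.3 percentage-point accuracy improvement over weight truncation when quantizing MobileNet-v2 to 4 bits post-training (19.8 points at 2 bits with retraining).

02. Provides up to 6x speedup and 1.9x energy improvement over state-of-the-art bit-serial architectures.

03. Enables effective quantization of MobileNet-v2 to 2-4 bits.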

Abstract

Quantization is spearheading the increase in performance and efficiency of neural network computing systems, making headway into commodity hardware. We present SWIS (Shared Weight bIt Sparsity), a quantization framework for efficient neural network inference acceleration that delivers improved performance and storage compression through an offline weight decomposition and scheduling algorithm. SWIS can achieve up to a 54.3 (19.8) percentage-point accuracy improvement over weight truncation when quantizing MobileNet-v2 to 4 (2) bits post-training (with retraining), showing the strength of leveraging shared bit sparsity in weights. The SWIS accelerator gives up to 6x speedup and 1.9x energy improvement over state-of-the-art bit-serial architectures.
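
The abstract does not spell out the decomposition algorithm, so the following is only a toy sketch of why sharing bit positions across a group of weights can beat plain truncation. It assumes unsigned 8-bit weight magnitudes, a brute-force search over bit positions, and squared error as the metric. All of these are illustrative assumptions, and the function names (`truncate`, `shared_bit_sparsity`, `nearest_with_positions`) are hypothetical, not taken from the paper.

```python
from itertools import combinations

def nearest_with_positions(w, positions):
    """Closest value to w that uses only the given bit positions (ties go low)."""
    candidates = [0]
    for p in positions:
        candidates += [c + (1 << p) for c in candidates]
    return min(candidates, key=lambda c: (abs(c - w), c))

def truncate(group, n_bits, total_bits=8):
    """Baseline: keep only the top n_bits of each weight magnitude."""
    mask = sum(1 << p for p in range(total_bits - n_bits, total_bits))
    return [w & mask for w in group]

def shared_bit_sparsity(group, n_bits, total_bits=8):
    """Pick one set of n_bits positions shared by the whole group.

    Brute-force search (an assumption for this sketch) over all position
    sets, minimizing the summed squared quantization error.
    Returns (error, positions, quantized_group).
    """
    best = None
    for positions in combinations(range(total_bits), n_bits):
        q = [nearest_with_positions(w, positions) for w in group]
        err = sum((a - b) ** 2 for a, b in zip(group, q))
        if best is None or err < best[0]:
            best = (err, positions, q)
    return best

group = [3, 5, 6, 2]          # small-magnitude weights, 8-bit range
trunc = truncate(group, 2)    # top two bits are empty for these values
err, positions, shared = shared_bit_sparsity(group, 2)
print("truncation:", trunc)
print("shared positions:", positions, "->", shared, "squared error:", err)
```

For small-magnitude groups like this one, the top bits carry no information, so truncation zeroes every weight, while a shared choice of lower bit positions preserves most of the value at the same bit budget.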

Taxonomy

Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Neural Networks and Applications