Multiply-and-Fire (MNF): An Event-driven Sparse Neural Network Accelerator

Miao Yu; Tingting Xiang; Venkata Pavan Kumar Miriyala; Trevor E. Carlson

arXiv:2204.09797·cs.AR·April 22, 2022

Open Access

TL;DR

This paper introduces an event-driven sparse neural network accelerator that significantly improves energy efficiency and performance for AI inference by leveraging activation-based sparsity and a highly-parallel dataflow approach.

Contribution

It presents a novel event-driven acceleration method that enhances system efficiency and utilization for sparse neural network inference, outperforming existing solutions.

Findings

1. Achieves a 1.46× energy-efficiency improvement over the state of the art.

2. Demonstrates high performance at 30 fps for CNN and MLP workloads.

3. Introduces a highly parallel dataflow method for better utilization.

Abstract

Machine learning, particularly deep neural network inference, has become a vital workload for many computing systems, from data centers and HPC systems to edge-based computing. As advances in sparsity have helped improve the efficiency of AI acceleration, there is a continued need for improved system efficiency for both high-performance and system-level acceleration. This work takes a unique look at sparsity with an event-driven (or activation-driven) approach to ANN acceleration that aims to minimize useless work, improve utilization, and increase performance and energy efficiency. Our analytical and experimental results show that this event-driven solution presents a new direction to enable highly efficient AI inference for both CNN and MLP workloads. This work demonstrates state-of-the-art energy efficiency and performance centring on activation-based sparsity and a highly-parallel…
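To make the activation-driven idea concrete, here is a minimal software sketch, not the paper's hardware design. It assumes a single fully connected layer and treats every nonzero input activation as an "event" that triggers multiply-accumulates against one weight column; zero activations trigger no work. The function names (`dense_layer`, `event_driven_layer`) and the toy values are hypothetical and chosen only for illustration.

```python
def dense_layer(weights, activations):
    """Baseline: visit every weight, even when the activation is zero."""
    outputs = [0.0] * len(weights)
    macs = 0  # multiply-accumulate operations performed
    for i, row in enumerate(weights):
        for j, a in enumerate(activations):
            outputs[i] += row[j] * a
            macs += 1
    return outputs, macs


def event_driven_layer(weights, activations):
    """Illustrative event-driven variant: only nonzero activations do work."""
    outputs = [0.0] * len(weights)
    macs = 0
    # Each nonzero activation becomes an (index, value) event.
    events = [(j, a) for j, a in enumerate(activations) if a != 0]
    for j, a in events:
        # One event updates every output neuron using weight column j.
        for i in range(len(weights)):
            outputs[i] += weights[i][j] * a
            macs += 1
    return outputs, macs


# Toy example: 2 output neurons, 4 inputs, half of the inputs are zero.
weights = [[1.0, 2.0, 3.0, 4.0],
           [-1.0, 0.0, 1.0, -2.0]]
activations = [0.0, 2.0, 0.0, 1.0]

dense_out, dense_macs = dense_layer(weights, activations)
event_out, event_macs = event_driven_layer(weights, activations)
print(dense_out, dense_macs)  # [8.0, -2.0] 8
print(event_out, event_macs)  # [8.0, -2.0] 4
```

In this sketch both variants produce the same outputs, while the event-driven loop performs work proportional to the number of nonzero activations rather than to the full input width. How the actual accelerator schedules and parallelizes such events is not described in the text above.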

Taxonomy

Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · Adversarial Robustness in Machine Learning