Retrospective: EIE: Efficient Inference Engine on Sparse and Compressed Neural Network
Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A. Horowitz, William J. Dally

TL;DR
EIE is a pioneering hardware accelerator that significantly improves the efficiency of pruned and compressed neural networks by leveraging sparsity and low-precision techniques, influencing subsequent AI hardware designs.
Contribution
This paper provides a retrospective review of EIE, highlighting its impact, strengths, weaknesses, and future opportunities in accelerating sparse and compressed neural networks.
Findings
EIE demonstrated substantial speedups for pruned neural networks.
It influenced numerous subsequent hardware and algorithm designs.
The review identifies new opportunities for pruning, sparsity, and low precision to accelerate emerging deep learning workloads.
Abstract
EIE proposed to accelerate pruned and compressed neural networks, exploiting weight sparsity, activation sparsity, and 4-bit weight-sharing in neural network accelerators. Since its publication at ISCA'16, it has opened a new design space to accelerate pruned and sparse neural networks and spawned many algorithm-hardware co-designs for model compression and acceleration, both in academia and in commercial AI chips. In retrospect, we review the background of this project, summarize the pros and cons, and discuss new opportunities where pruning, sparsity, and low precision can accelerate emerging deep learning workloads.
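To make the three techniques named in the abstract concrete, the following is a minimal software sketch, not the paper's hardware design. It assumes the pruned weight matrix is stored column by column (compressed sparse column form), that each nonzero weight is a 4-bit index into a shared table of at most 16 values, and that zero activations are skipped. The function name `sparse_matvec` and all parameter names are hypothetical, chosen for illustration only; the sketch does not model parallel processing elements or on-chip storage.

```python
def sparse_matvec(codebook, col_ptr, row_idx, code_idx, activations, n_rows):
    """Compute y = W @ a for a pruned, weight-shared matrix W.

    Illustrative assumptions (not taken from the source page):
      codebook    : shared weight values; a 4-bit index allows at most 16
      col_ptr     : col_ptr[j]..col_ptr[j+1] spans the nonzeros of column j
      row_idx     : row of each stored nonzero
      code_idx    : codebook index of each stored nonzero
      activations : dense input vector, possibly containing zeros
    """
    if len(codebook) > 16:
        raise ValueError("a 4-bit index can address at most 16 shared weights")
    y = [0.0] * n_rows
    for j, a in enumerate(activations):
        if a == 0.0:
            continue  # activation sparsity: skip the whole column
        # Weight sparsity: only stored nonzeros of column j are visited.
        for k in range(col_ptr[j], col_ptr[j + 1]):
            # Weight sharing: decode the small index through the codebook.
            y[row_idx[k]] += codebook[code_idx[k]] * a
    return y


# Example: a 3x4 matrix with four nonzeros drawn from three shared values.
#   W = [[ 0.0, 0.5, 0.0, 0.0],
#        [-1.0, 0.0, 0.0, 0.5],
#        [ 0.0, 0.0, 2.0, 0.0]]
codebook = [-1.0, 0.5, 2.0]
col_ptr = [0, 1, 2, 3, 4]
row_idx = [1, 0, 2, 1]
code_idx = [0, 1, 2, 1]
activations = [2.0, 0.0, 1.0, 3.0]  # the zero entry causes column 1 to be skipped

y = sparse_matvec(codebook, col_ptr, row_idx, code_idx, activations, n_rows=3)
print(y)
```

In this sketch the work done is proportional to the number of stored nonzeros in columns whose activation is nonzero, which is the saving that pruning plus activation skipping is meant to deliver; the codebook lookup stands in for the low-precision weight representation.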
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and ELM · Brain Tumor Detection and Classification
