FAT: An In-Memory Accelerator with Fast Addition for Ternary Weight Neural Networks

Shien Zhu; Luan H.K. Duong; Hui Chen; Di Liu; Weichen Liu

arXiv:2201.07634·cs.AR·August 3, 2022

TL;DR

This paper introduces FAT, an in-memory accelerator for ternary weight neural networks. It exploits weight sparsity and a fast addition scheme, and reports about 10X speedup and 12X energy efficiency over ParaPIM on sparse networks.

Contribution

FAT presents hardware techniques designed specifically for Ternary Weight Networks (TWNs), including a Sparse Addition Control Unit and a memory-based fast addition scheme, to improve In-Memory-Computing (IMC) acceleration.

Findings

01

FAT achieves 2.00X speedup in addition operations.

02

FAT improves power and area efficiency by 22%.

03

FAT outperforms ParaPIM with 10X speedup and 12X energy efficiency on sparse networks.

Abstract

Convolutional Neural Networks (CNNs) demonstrate excellent performance in various applications but have high computational complexity. Quantization is applied to reduce the latency and storage cost of CNNs. Among the quantization methods, Binary and Ternary Weight Networks (BWNs and TWNs) have a unique advantage over 8-bit and 4-bit quantization. They replace the multiplication operations in CNNs with additions, which are favoured on In-Memory-Computing (IMC) devices. IMC acceleration for BWNs has been widely studied. However, though TWNs have higher accuracy and better sparsity than BWNs, IMC acceleration for TWNs has limited research. TWNs on existing IMC devices are inefficient because the sparsity is not well utilized, and the addition operation is not efficient. In this paper, we propose FAT as a novel IMC accelerator for TWNs. First, we propose a Sparse Addition Control Unit,…
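
The abstract's point that ternary weights turn multiplications into additions, and that zero weights can be skipped, can be illustrated in software. The following is a minimal sketch under stated assumptions, not FAT's hardware design: the function name `ternary_dot` and the sample data are hypothetical, and the zero-skipping branch only mimics the idea of exploiting sparsity rather than modeling the Sparse Addition Control Unit.

```python
def ternary_dot(activations, weights):
    """Dot product with weights restricted to {-1, 0, +1}.

    Hypothetical illustration: uses only additions and subtractions,
    and skips zero weights entirely. Returns (result, additions_performed).
    """
    if len(activations) != len(weights):
        raise ValueError("activations and weights must have the same length")

    acc = 0
    additions = 0
    for a, w in zip(activations, weights):
        if w == 0:
            continue          # zero weight: no work needed (sparsity)
        if w == 1:
            acc += a          # +1 weight: add the activation
        elif w == -1:
            acc -= a          # -1 weight: subtract the activation
        else:
            raise ValueError("weights must be -1, 0, or +1")
        additions += 1
    return acc, additions


# Sample data (assumed for illustration): half of the weights are zero.
acts = [3, -1, 4, 1, -5, 9]
wts = [1, 0, -1, 0, 0, 1]
result, n_adds = ternary_dot(acts, wts)
print(result, n_adds)
```

In this example only three of the six positions require an add or subtract, which suggests why higher sparsity leaves less work for an accelerator that can skip zero weights.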
