Fundamental Limits on Energy-Delay-Accuracy of In-memory Architectures in Inference Applications

Sujan Kumar Gonugondla; Charbel Sakr; Hassan Dbouk; Naresh R. Shanbhag

arXiv:2012.13645·cs.AR·December 29, 2020

TL;DR

This paper establishes fundamental limits on the precision, energy, and delay of in-memory computing architectures (IMCs) for inference, analyzing noise models and SNR metrics to optimize accuracy and efficiency.

Contribution

It introduces a noise model and SNR analysis framework for IMCs, proposes the minimum precision criterion (MPC), and compares three compute models to optimize energy-delay-accuracy trade-offs.

Findings

1. IMCs have an upper bound on their achievable compute SNR due to energy, area, and voltage constraints.

2. MPC enables near-ideal SNR with minimal ADC precision.

3. Charge-summing (QS) architectures are best suited to low-SNR scenarios, and charge-redistribution (QR) architectures to high-SNR scenarios.

Abstract

This paper obtains fundamental limits on the computational precision of in-memory computing architectures (IMCs). An IMC noise model and associated SNR metrics are defined and their interrelationships analyzed to show that the accuracy of IMCs is fundamentally limited by the compute SNR ($\mathrm{SNR}_{a}$) of its analog core, and that activation, weight, and output precisions need to be assigned appropriately for the final output SNR to satisfy $\mathrm{SNR}_{T} \to \mathrm{SNR}_{a}$. The minimum precision criterion (MPC) is proposed to minimize the ADC precision. Three in-memory compute models, charge summing (QS), current summing (IS), and charge redistribution (QR), are shown to underlie most known IMCs. Noise, energy, and delay expressions for the compute models are developed and employed to derive expressions for the SNR, ADC precision, energy, and latency of IMCs. The…
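
The relationship between the analog compute SNR and the final output SNR can be illustrated with a small numerical sketch. It is not the paper's noise model or its MPC: it assumes independent additive noise sources (so inverse SNRs add), uses the textbook 6.02B + 1.76 dB SQNR of an ideal uniform quantizer driven by a full-scale sinusoid, and picks an arbitrary 0.5 dB tolerance. The function names and the tolerance are hypothetical and chosen only for illustration.

```python
import math


def db_to_linear(snr_db: float) -> float:
    """Convert an SNR in dB to a linear power ratio."""
    return 10.0 ** (snr_db / 10.0)


def linear_to_db(snr: float) -> float:
    """Convert a linear power ratio to dB."""
    return 10.0 * math.log10(snr)


def combine_snr_db(*snrs_db: float) -> float:
    """Total SNR for independent additive noise sources on one signal.

    Assumption: noise powers add, so 1/SNR_total = sum(1/SNR_i).
    """
    inverse_total = sum(1.0 / db_to_linear(s) for s in snrs_db)
    return linear_to_db(1.0 / inverse_total)


def uniform_sqnr_db(bits: int) -> float:
    """Textbook SQNR of an ideal B-bit uniform quantizer (full-scale sinusoid).

    Assumption: stands in for ADC quantization noise; the paper's own
    expressions are not reproduced here.
    """
    return 6.02 * bits + 1.76


def min_adc_bits(snr_a_db: float, max_loss_db: float = 0.5, max_bits: int = 16) -> int:
    """Smallest ADC precision keeping the output SNR within max_loss_db of SNR_a."""
    for bits in range(1, max_bits + 1):
        snr_t_db = combine_snr_db(snr_a_db, uniform_sqnr_db(bits))
        if snr_a_db - snr_t_db <= max_loss_db:
            return bits
    raise ValueError("no precision up to max_bits meets the tolerance")


if __name__ == "__main__":
    for snr_a_db in (10.0, 20.0, 30.0):
        bits = min_adc_bits(snr_a_db)
        snr_t_db = combine_snr_db(snr_a_db, uniform_sqnr_db(bits))
        print(f"SNR_a = {snr_a_db:.1f} dB: {bits} bits, SNR_T = {snr_t_db:.2f} dB")
```

Under these assumptions, adding ADC bits beyond the returned value moves the output SNR only marginally closer to the compute SNR, which is the intuition behind assigning just enough precision for the output SNR to approach the compute SNR.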
