Fundamental Limits on Energy-Delay-Accuracy of In-memory Architectures in Inference Applications
Sujan Kumar Gonugondla, Charbel Sakr, Hassan Dbouk, Naresh R. Shanbhag

TL;DR
This paper establishes fundamental limits on the precision, energy, and delay of in-memory computing architectures (IMCs) for inference, analyzing noise models and SNR metrics to optimize accuracy and efficiency.
Contribution
It introduces a noise model and SNR analysis framework for IMCs, proposes the minimum precision criterion (MPC), and compares three compute models to optimize energy-delay-accuracy trade-offs.
Findings
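IMCs have an upper bound on their maximum achievable compute SNR due to constraints on energy, area, and voltage swing.
MPC lets the final output SNR approach the compute SNR with minimal ADC precision.
QS-based architectures are preferred for low compute SNR scenarios, and QR-based architectures for high compute SNR scenarios.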
Abstract
This paper obtains fundamental limits on the computational precision of in-memory computing architectures (IMCs). An IMC noise model and associated SNR metrics are defined and their interrelationships analyzed to show that the accuracy of IMCs is fundamentally limited by the compute SNR (SNR_a) of its analog core, and that activation, weight and output precision needs to be assigned appropriately for the final output SNR, SNR_T → SNR_a. The minimum precision criterion (MPC) is proposed to minimize the ADC precision. Three in-memory compute models - charge summing (QS), current summing (IS) and charge redistribution (QR) - are shown to underlie most known IMCs. Noise, energy and delay expressions for the compute models are developed and employed to derive expressions for the SNR, ADC precision, energy, and latency of IMCs. The…
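As a rough illustration of why the analog core's compute SNR caps the final output SNR, and why ADC precision beyond a certain point buys little, here is a minimal sketch. It is not the paper's noise model or its MPC derivation. It assumes that analog compute noise and ADC quantization noise are independent and additive, and it uses the textbook rule of thumb SQNR ≈ 6.02·B + 1.76 dB for an ideal B-bit uniform quantizer driven by a full-scale sinusoid. All function names and the 0.5 dB loss budget are hypothetical choices made for this example.

```python
import math


def db_to_linear(db):
    """Convert a power ratio from dB to linear scale."""
    return 10 ** (db / 10)


def linear_to_db(ratio):
    """Convert a linear power ratio to dB."""
    return 10 * math.log10(ratio)


def sqnr_db(bits):
    """Rule-of-thumb SQNR of an ideal uniform quantizer (assumption:
    full-scale sinusoidal input, no clipping)."""
    return 6.02 * bits + 1.76


def total_snr_db(snr_a_db, adc_bits):
    """Output SNR when analog compute noise and ADC quantization noise
    are independent and add in power (assumption, not the paper's model)."""
    inverse = 1 / db_to_linear(snr_a_db) + 1 / db_to_linear(sqnr_db(adc_bits))
    return linear_to_db(1 / inverse)


def min_adc_bits(snr_a_db, max_loss_db=0.5, max_bits=16):
    """Smallest ADC precision that keeps the output SNR within
    max_loss_db of the compute SNR, or None if max_bits is not enough."""
    for bits in range(1, max_bits + 1):
        if snr_a_db - total_snr_db(snr_a_db, bits) <= max_loss_db:
            return bits
    return None


if __name__ == "__main__":
    snr_a = 20.0  # hypothetical compute SNR in dB
    for bits in (3, 4, 5, 6, 8):
        print(f"B={bits}: SNR_T = {total_snr_db(snr_a, bits):.2f} dB")
    print("minimum ADC bits for <= 0.5 dB loss:", min_adc_bits(snr_a))
```

Under these assumptions the output SNR never exceeds the compute SNR, and each extra ADC bit yields a smaller gain once the quantization noise falls well below the analog noise. That is the intuition behind assigning only as much ADC precision as the analog core can make use of.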
