Towards Model-Size Agnostic, Compute-Free, Memorization-based Inference of Deep Learning
Davide Giacomini, Maeesha Binte Hashem, Jeremiah Suarez, Swarup Bhunia, and Amit Ranjan Trivedi

TL;DR
This paper introduces a compute-free, memorization-based inference method for deep neural networks that uses lookup tables and in-memory computing, reducing energy consumption and relaxing model-size constraints.
Contribution
It presents a novel memorization-based inference approach leveraging lookup tables and in-memory circuits, enabling compute-free, model-size agnostic inference for deep learning.
Findings
Achieves 2.7x energy efficiency improvement over MLP-CIM
Achieves 83x energy efficiency improvement over ResNet20-CIM
Demonstrates effective inference on MNIST with reduced computation
Abstract
The rapid advancement of deep neural networks has significantly improved various tasks, such as image and speech recognition. However, as the complexity of these models increases, so do the computational cost and the number of parameters, making it difficult to deploy them on resource-constrained devices. This paper proposes a novel memorization-based inference (MBI) that is compute-free and only requires lookups. Specifically, our work capitalizes on the inference mechanism of the recurrent attention model (RAM), where only a small window of the input domain (a glimpse) is processed in one time step, and the outputs from multiple glimpses are combined through a hidden vector to determine the overall classification output of the problem. By leveraging the low dimensionality of the glimpse, our inference procedure stores key-value pairs comprising glimpse location, patch vector, etc. in a…
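The lookup-only inference loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the key layout (glimpse location, binarized patch, hidden-state index), the helper names `extract_patch` and `mbi_infer`, the 2x2 glimpse size, the binarization threshold, and the behavior on a table miss are all assumptions made for illustration. The sketch only shows the general idea that each time step becomes a table read instead of a forward pass.

```python
from typing import Dict, Optional, Tuple

Loc = Tuple[int, int]
# Hypothetical key: (glimpse location, binarized patch, hidden-state index)
Key = Tuple[Loc, Tuple[int, ...], int]
# Hypothetical value: (next hidden-state index, next glimpse location)
Value = Tuple[int, Loc]


def extract_patch(image, loc: Loc, size: int = 2, threshold: float = 0.5):
    """Crop a size x size glimpse at loc and binarize it so it can act as a key."""
    r, c = loc
    return tuple(
        int(image[i][j] > threshold)
        for i in range(r, r + size)
        for j in range(c, c + size)
    )


def mbi_infer(image, table: Dict[Key, Value], class_of_state: Dict[int, str],
              start_loc: Loc = (0, 0), steps: int = 2) -> Optional[str]:
    """Classify by chaining table lookups; no multiply-accumulate is performed."""
    state, loc = 0, start_loc
    for _ in range(steps):
        key = (loc, extract_patch(image, loc), state)
        if key not in table:
            # Assumption: give up on a miss. A real system would need a fallback.
            return None
        state, loc = table[key]
    return class_of_state.get(state)


# Toy 4x4 "image" with a bright 2x2 block in the top-left corner.
image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# Hand-written table standing in for entries memorized from a trained model.
table: Dict[Key, Value] = {
    ((0, 0), (1, 1, 1, 1), 0): (1, (2, 2)),  # first glimpse: bright patch
    ((2, 2), (0, 0, 0, 0), 1): (2, (0, 0)),  # second glimpse: dark patch
}
class_of_state = {2: "class A"}

result = mbi_infer(image, table, class_of_state)
```

Here `result` is `"class A"` after two lookups, while an all-dark image misses on the first key and returns `None`. In this sketch the table size depends on how many distinct (location, patch, state) combinations are stored, not on the number of parameters in the network that produced them, which is one way to read the "model-size agnostic" claim.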
Taxonomy
Topics: Advanced Memory and Neural Computing · Advanced Neural Network Applications · Ferroelectric and Negative Capacitance Devices
