RAMAN: A Re-configurable and Sparse tinyML Accelerator for Inference on Edge
Adithya Krishna, Srikanth Rohit Nudurupati, Chandana D G, Pritesh Dwivedi, André van Schaik, Mahesh Mehendale, Chetan Singh Thakur

TL;DR
RAMAN is a reconfigurable sparse accelerator for DNN inference at the edge. It exploits weight and activation sparsity, together with a Gustavson-inspired dataflow, to reduce area, power, and latency, and it can be configured for a range of DNN topologies and accuracy versus power/latency tradeoffs.
Contribution
It introduces RAMAN, a novel reconfigurable sparse accelerator architecture that leverages a Gustavson-inspired dataflow and memory overlap techniques for efficient edge DNN inference.
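For readers unfamiliar with the term, Gustavson's algorithm is the row-wise formulation of sparse matrix multiplication: each nonzero in a row of the left operand scales one row of the right operand, and the partial sums for a single output row are accumulated locally. The sketch below shows the generic textbook algorithm in software only. It is not RAMAN's hardware dataflow, and the names `to_sparse_rows` and `gustavson_spgemm` are hypothetical helpers introduced for illustration.

```python
from collections import defaultdict


def to_sparse_rows(dense):
    """Convert a dense 2-D list into {row: {col: value}}, dropping zeros."""
    return {i: {j: v for j, v in enumerate(row) if v != 0}
            for i, row in enumerate(dense)}


def gustavson_spgemm(a_rows, b_rows):
    """Row-wise sparse product C = A @ B (generic Gustavson formulation).

    Returns the sparse result and the number of multiply-accumulates
    actually performed, so the effect of skipping zeros is visible.
    """
    c_rows = {}
    macs = 0
    for i, a_row in a_rows.items():
        acc = defaultdict(float)  # partial sums for output row i only
        for k, a_ik in a_row.items():  # nonzeros of A[i, :]
            for j, b_kj in b_rows.get(k, {}).items():  # nonzeros of B[k, :]
                acc[j] += a_ik * b_kj
                macs += 1
        c_rows[i] = dict(acc)
    return c_rows, macs


# Illustrative data (not taken from the paper).
A = [[1, 0, 2],
     [0, 0, 3]]
B = [[0, 4],
     [5, 0],
     [6, 0]]

c_rows, macs = gustavson_spgemm(to_sparse_rows(A), to_sparse_rows(B))
dense_macs = len(A) * len(B) * len(B[0])  # MACs a dense product would need
print(c_rows, macs, dense_macs)
```

Because the accumulator holds exactly one output row at a time, partial sums never need to leave local storage, and any zero in either operand is skipped without costing a multiply.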
Findings
Processes MobileNetV1 at 98.47 GOp/s/W
Achieves 79.68 GOp/s/W on DS-CNN
Reduces storage by up to 50% through memory overlap
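One way to see where an upper bound of 50% can come from is a simple idealized model, sketched below. It assumes that "memory overlap" means a layer's input and output activations share one buffer, and that the output can fully overwrite the input as it is consumed. That interpretation, and the helper names `peak_activation_words` and `storage_saving`, are assumptions for illustration and not details confirmed by the summary above.

```python
def peak_activation_words(ia_words, oa_words, overlap=True):
    """Peak activation storage for one layer under an idealized model.

    Without overlap, input and output activations occupy separate buffers.
    With (assumed) full overlap, they share one buffer sized for the larger.
    """
    if overlap:
        return max(ia_words, oa_words)
    return ia_words + oa_words


def storage_saving(ia_words, oa_words):
    """Fraction of activation storage saved by sharing the buffer."""
    separate = peak_activation_words(ia_words, oa_words, overlap=False)
    shared = peak_activation_words(ia_words, oa_words, overlap=True)
    return 1.0 - shared / separate


# Hypothetical layer sizes, chosen only to illustrate the bound.
equal_case = storage_saving(1024, 1024)    # input and output the same size
unequal_case = storage_saving(4096, 1024)  # output much smaller than input
print(equal_case, unequal_case)
```

Under this model the saving equals min(IA, OA) / (IA + OA), which peaks at one half when the two feature maps are the same size and shrinks as their sizes diverge. A real design would likely need some margin between read and write pointers, so the achieved saving would be somewhat lower.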
Abstract
Deep Neural Network (DNN) based inference at the edge is challenging as these compute and data-intensive algorithms need to be implemented at low cost and low power while meeting the latency constraints of the target applications. Sparsity, in both activations and weights inherent to DNNs, is a key knob to leverage. In this paper, we present RAMAN, a Re-configurable and spArse tinyML Accelerator for infereNce on edge, architected to exploit the sparsity to reduce area (storage), power as well as latency. RAMAN can be configured to support a wide range of DNN topologies - consisting of different convolution layer types and a range of layer parameters (feature-map size and the number of channels). RAMAN can also be configured to support accuracy vs power/latency tradeoffs using techniques deployed at compile-time and run-time. We present the salient features of the architecture, provide…
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning in Materials Science · Brain Tumor Detection and Classification
Methods: Depthwise Convolution · Pointwise Convolution · Dense Connections · Depthwise Separable Convolution · Batch Normalization · Average Pooling · 1x1 Convolution · Softmax · Convolution
