QADAM: Quantization-Aware DNN Accelerator Modeling for Pareto-Optimality
Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

TL;DR
QADAM is a flexible modeling framework that incorporates quantization effects into DNN accelerator design, enabling efficient exploration of design trade-offs and identifying Pareto-optimal configurations for energy and performance.
Contribution
The paper introduces QADAM, a comprehensive quantization-aware modeling framework for DNN accelerators that supports design space exploration with accurate power, performance, and area estimations.
Findings
Different bit precisions and PE types lead to significant differences in performance per area and energy.
Lightweight processing elements (LightPEs) achieve Pareto-optimal results.
LightPEs outperform INT16 designs by up to 5.7x in performance per area and energy.
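To make the notion of Pareto-optimality in these findings concrete, here is a minimal sketch of how a Pareto front could be extracted from a set of accelerator design points. It is not QADAM's API: the `DesignPoint` type, the `dominates` and `pareto_front` helpers, the design labels, and all numbers are hypothetical placeholders chosen for illustration, with performance per area treated as higher-is-better and energy as lower-is-better.

```python
from typing import List, NamedTuple


class DesignPoint(NamedTuple):
    """Hypothetical accelerator configuration (not a real QADAM result)."""
    name: str
    perf_per_area: float  # higher is better, arbitrary units
    energy: float         # lower is better, arbitrary units


def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """True if a is no worse than b on both metrics and strictly better on one."""
    no_worse = a.perf_per_area >= b.perf_per_area and a.energy <= b.energy
    strictly_better = a.perf_per_area > b.perf_per_area or a.energy < b.energy
    return no_worse and strictly_better


def pareto_front(points: List[DesignPoint]) -> List[DesignPoint]:
    """Keep every point that no other point dominates (simple O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Placeholder values for illustration only; they are not taken from the paper.
candidates = [
    DesignPoint("A", perf_per_area=1.0, energy=4.0),
    DesignPoint("B", perf_per_area=2.0, energy=3.0),
    DesignPoint("C", perf_per_area=1.5, energy=3.5),
    DesignPoint("D", perf_per_area=3.0, energy=5.0),
]

front = pareto_front(candidates)
print([p.name for p in front])  # prints ['B', 'D']
```

In this toy set, B dominates both A and C, while B and D trade off against each other (D has higher performance per area, B has lower energy), so both remain on the front. A quadratic scan is adequate for small design spaces; larger sweeps would typically sort by one metric first.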
Abstract
As the machine learning and systems communities strive to achieve higher energy efficiency through custom deep neural network (DNN) accelerators and varied bit precision or quantization levels, there is a need for design space exploration frameworks that incorporate quantization-aware processing elements (PEs) into the accelerator design space while having accurate and fast power, performance, and area models. In this work, we present QADAM, a highly parameterized quantization-aware power, performance, and area modeling framework for DNN accelerators. Our framework can facilitate future research on design space exploration and Pareto-efficiency of DNN accelerators for various design choices such as bit precision, PE type, scratchpad sizes of PEs, global buffer size, number of total PEs, and DNN configurations. Our results show that different bit precisions and PE types lead to significant…
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing · Advanced Neural Network Applications
