QUIDAM: A Framework for Quantization-Aware DNN Accelerator and Model Co-Exploration
Ahmet Inci, Siri Garudanagiri Virupaksha, Aman Jain, Ting-Wu Chin, Venkata Vivek Thallam, Ruizhou Ding, Diana Marculescu

TL;DR
QUIDAM is a comprehensive framework for exploring quantization-aware DNN accelerators, enabling rapid evaluation of various design choices to optimize performance, energy efficiency, and area, significantly accelerating the design process.
Contribution
The paper introduces QUIDAM, a highly parameterized, fast, and accurate co-exploration framework for quantization-aware DNN accelerator design, incorporating diverse design parameters and models.
Findings
Different bit precisions significantly affect performance per area and energy.
Lightweight processing elements can match accuracy while improving efficiency.
QUIDAM accelerates design exploration by 3-4 orders of magnitude.
Abstract
As the machine learning and systems communities strive to achieve higher energy-efficiency through custom deep neural network (DNN) accelerators, varied precision or quantization levels, and model compression techniques, there is a need for design space exploration frameworks that incorporate quantization-aware processing elements into the accelerator design space while having accurate and fast power, performance, and area models. In this work, we present QUIDAM, a highly parameterized quantization-aware DNN accelerator and model co-exploration framework. Our framework can facilitate future research on design space exploration of DNN accelerators for various design choices such as bit precision, processing element type, scratchpad sizes of processing elements, global buffer size, number of total processing elements, and DNN configurations. Our results show that different bit precisions…
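The design choices listed in the abstract (bit precision, processing element type, scratchpad size, global buffer size, and number of processing elements) can be pictured as axes of a grid that a co-exploration framework sweeps over. The following is a minimal sketch of that idea only, not QUIDAM's actual interface: the `DesignPoint` class, the `enumerate_design_points` helper, and every parameter value below are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DesignPoint:
    """One candidate accelerator configuration (hypothetical field names)."""
    bit_precision: int
    pe_type: str
    scratchpad_bytes: int
    global_buffer_kib: int
    num_pes: int


# Illustrative values only; the paper's actual ranges are not given in this summary.
DESIGN_SPACE = {
    "bit_precision": [4, 8, 16],
    "pe_type": ["pe_type_a", "pe_type_b"],
    "scratchpad_bytes": [64, 128],
    "global_buffer_kib": [64, 128],
    "num_pes": [64, 256],
}


def enumerate_design_points(space):
    """Return every combination of the parameter values as a DesignPoint."""
    keys = list(space)
    combos = product(*(space[key] for key in keys))
    return [DesignPoint(**dict(zip(keys, values))) for values in combos]


points = enumerate_design_points(DESIGN_SPACE)
print(len(points))  # 3 * 2 * 2 * 2 * 2 = 48 candidate configurations
```

Even this toy grid yields 48 configurations, and the count grows multiplicatively with each added axis or value, which suggests why fast power, performance, and area models matter more than per-design simulation for exploring a space like this.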
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing · Advanced Neural Network Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
