QUIDAM: A Framework for Quantization-Aware DNN Accelerator and Model Co-Exploration

Ahmet Inci; Siri Garudanagiri Virupaksha; Aman Jain; Ting-Wu Chin; Venkata Vivek Thallam; Ruizhou Ding; Diana Marculescu

arXiv:2206.15463 · cs.AR · July 1, 2022

Open Access

TL;DR

QUIDAM is a comprehensive framework for exploring quantization-aware DNN accelerators, enabling rapid evaluation of various design choices to optimize performance, energy efficiency, and area, significantly accelerating the design process.

Contribution

The paper introduces QUIDAM, a highly parameterized, fast, and accurate co-exploration framework for quantization-aware DNN accelerator design, incorporating diverse design parameters and models.

Findings

01 Different bit precisions significantly affect performance per area and energy.

02 Lightweight processing elements can match accuracy while improving efficiency.

03 QUIDAM accelerates design exploration by 3-4 orders of magnitude.

Abstract

As the machine learning and systems communities strive to achieve higher energy-efficiency through custom deep neural network (DNN) accelerators, varied precision or quantization levels, and model compression techniques, there is a need for design space exploration frameworks that incorporate quantization-aware processing elements into the accelerator design space while having accurate and fast power, performance, and area models. In this work, we present QUIDAM, a highly parameterized quantization-aware DNN accelerator and model co-exploration framework. Our framework can facilitate future research on design space exploration of DNN accelerators for various design choices such as bit precision, processing element type, scratchpad sizes of processing elements, global buffer size, number of total processing elements, and DNN configurations. Our results show that different bit precisions…
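
As a rough illustration of the design space the abstract describes, the sketch below enumerates every combination of the listed hardware parameters. It is not taken from QUIDAM: the `DesignPoint` class, the `enumerate_designs` helper, the processing element type names, and all parameter values are hypothetical placeholders chosen only to show how quickly such a space grows.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DesignPoint:
    """One candidate accelerator configuration (hypothetical structure)."""
    bit_precision: int       # operand width in bits
    pe_type: str             # processing element type
    scratchpad_bytes: int    # per-PE scratchpad size
    global_buffer_kib: int   # shared global buffer size
    num_pes: int             # total number of processing elements


# All values below are illustrative assumptions, not numbers from the paper.
SPACE = {
    "bit_precision": [4, 8, 16],
    "pe_type": ["full_mac", "lightweight"],
    "scratchpad_bytes": [64, 128, 256],
    "global_buffer_kib": [64, 128],
    "num_pes": [64, 256],
}


def enumerate_designs(space):
    """Return every combination of the parameter values as a DesignPoint."""
    keys = list(space)
    return [
        DesignPoint(**dict(zip(keys, values)))
        for values in product(*(space[k] for k in keys))
    ]


designs = enumerate_designs(SPACE)
print(len(designs), "candidate designs")
print(designs[0])
```

Even with two or three options per parameter the candidate count multiplies quickly, which is why a framework of this kind relies on fast power, performance, and area models rather than evaluating each candidate with a slow, detailed flow.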

Taxonomy

Topics: CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing · Advanced Neural Network Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings