Machine Learning aided Computer Architecture Design for CNN Inferencing Systems
Christopher A. Metz

TL;DR
This paper introduces a rapid, accurate method for predicting power and performance of CNN inference on GPGPUs, streamlining the design space exploration process for efficient hardware selection in ML systems.
Contribution
It presents a novel technique for fast power and performance forecasting of CNNs on GPGPUs, reducing manual effort and development time.
Findings
Mean absolute percentage error (MAPE) of 5.03% for power prediction
MAPE of 5.94% for performance prediction
Enables early-stage estimation to save time and costs
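For readers unfamiliar with the metric, the sketch below shows how a MAPE figure like the ones above is conventionally computed. It is a generic illustration of the standard formula, not the paper's evaluation code; the function name `mape` and the sample wattage values are hypothetical and not taken from the paper.

```python
def mape(actual, predicted):
    """Return the mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one sample is required")
    # Average of |actual - predicted| / |actual| over all samples.
    # Each actual value must be non-zero, or the ratio is undefined.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical measured vs. predicted power draw in watts (illustrative only).
measured_watts = [100.0, 200.0, 50.0]
predicted_watts = [95.0, 210.0, 52.0]

print(f"{mape(measured_watts, predicted_watts):.2f}%")  # prints "4.67%"
```

A MAPE of about 5% therefore means that, averaged over the evaluated samples, a prediction deviates from the measured value by roughly 5% of that value.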
Abstract
Efficient and timely execution of Machine Learning (ML) algorithms is essential for emerging technologies like autonomous driving, the Internet of Things (IoT), and edge computing. Convolutional Neural Networks (CNNs), one of the primary classes of ML algorithms used in such systems, demand high computational resources. This requirement has led to the use of ML accelerators like GPGPUs to meet design constraints. However, selecting the most suitable accelerator involves Design Space Exploration (DSE), a process that is usually time-consuming and requires significant manual effort. Our work presents approaches to expedite the DSE process by identifying the most appropriate GPGPU for CNN inferencing systems. We have developed a quick and precise technique for forecasting the power and performance of CNNs during inference, with MAPEs of 5.03% and 5.94%, respectively. Our approach…
