DLAS: An Exploration and Assessment of the Deep Learning Acceleration Stack

Perry Gibson, José Cano, Elliot J. Crowley, Amos Storkey, Michael O'Boyle

arXiv:2311.08909·cs.LG·November 16, 2023·2 cites

Open Access

TL;DR

This paper introduces the Deep Learning Acceleration Stack (DLAS), which combines machine learning and systems techniques to analyze deep neural network acceleration across hardware and software layers, and uses an across-stack perturbation study to show how tightly those layers depend on each other.

Contribution

The paper presents DLAS, a unified stack for evaluating DNN acceleration, and shows that cross-layer interactions and auto-tuning both matter when optimizing inference time and accuracy.

Findings

1. Model size, accuracy, and inference time are not always correlated.

2. Speedups from compression are hardware-dependent.

3. Auto-tuning significantly influences optimal algorithm choice.

Abstract

Deep Neural Networks (DNNs) are extremely computationally demanding, which presents a large barrier to their deployment on resource-constrained devices. Since such devices are where many emerging deep learning applications lie (e.g., drones, vision-based medical technology), significant bodies of work from both the machine learning and systems communities have attempted to provide optimizations to accelerate DNNs. To help unify these two perspectives, in this paper we combine machine learning and systems techniques within the Deep Learning Acceleration Stack (DLAS), and demonstrate how these layers can be tightly dependent on each other with an across-stack perturbation study. We evaluate the impact on accuracy and inference time when varying different parameters of DLAS across two datasets, seven popular DNN architectures, four DNN compression techniques, three algorithmic primitives…
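
To give a feel for the size of such a perturbation study, the following is a minimal sketch that enumerates only the four axes named in the visible part of the abstract. The counts (two, seven, four, three) come from the abstract; every name in the code (`dataset_0`, `arch_0`, `sweep`, and so on) is a hypothetical placeholder rather than the paper's actual choice, and whether the authors evaluate the full cross product of these axes is an assumption made here for illustration. The abstract is truncated, so further axes are not represented.

```python
from itertools import product

# Axis sizes are taken from the abstract; the entries themselves are
# hypothetical placeholders, not the datasets, models, or techniques
# actually used in the paper.
datasets = [f"dataset_{i}" for i in range(2)]
architectures = [f"arch_{i}" for i in range(7)]
compression_techniques = [f"compression_{i}" for i in range(4)]
algorithmic_primitives = [f"primitive_{i}" for i in range(3)]


def sweep():
    """Yield one configuration per point in the assumed full grid."""
    axes = product(
        datasets, architectures, compression_techniques, algorithmic_primitives
    )
    for dataset, arch, compression, primitive in axes:
        yield {
            "dataset": dataset,
            "architecture": arch,
            "compression": compression,
            "primitive": primitive,
        }


configs = list(sweep())
print(len(configs))  # 2 * 7 * 4 * 3 = 168
```

Even with only these four axes, a full grid would already contain 168 configurations, each of which would need its own accuracy and inference-time measurement.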

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications