ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models

Matthias Wess; Matvey Ivanov; Anvesh Nookala; Christoph Unger; Alexander Wendt; Axel Jantsch

arXiv:2105.03176·cs.LG·May 10, 2021

TL;DR

ANNETTE is a stacked-model framework for estimating DNN inference latency on hardware accelerators, which decouples architecture search from the target hardware.

Contribution

The paper proposes a stacked modeling approach that combines mapping models with layer-wise estimation models, both extracted from micro-kernel and multi-layer benchmarks, to estimate network execution time on hardware accelerators.

Findings

01 Average estimation error of 3.47% on DNNDK

02 Fidelity of 0.988 on NASBench networks

03 Outperforms existing statistical and analytical models

Abstract

With new accelerator hardware for DNNs, the computing power for AI applications has increased rapidly. However, as DNN algorithms become more complex and optimized for specific applications, latency requirements remain challenging, and it is critical to find the optimal points in the design space. To decouple the architectural search from the target hardware, we propose a time estimation framework that allows for modeling the inference latency of DNNs on hardware accelerators based on mapping and layer-wise estimation models. The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation. For evaluation, we compare the estimation accuracy and fidelity of the generated mixed models and statistical models with the roofline model and a refined roofline model. We test the mixed models on…
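
The abstract uses the roofline model as a baseline. As a point of reference, the sketch below shows the standard roofline estimate applied layer by layer: each layer's time is the larger of its compute time and its memory-transfer time, and the network time is the sum over layers. This is only the plain analytical baseline, not ANNETTE's stacked model. The `Layer` class, the layer figures, and the hardware numbers are hypothetical, and summing per-layer times assumes layers run one after another with no fusion or overlap.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Layer:
    name: str
    ops: float          # arithmetic operations needed by the layer
    bytes_moved: float  # bytes read from and written to external memory


def roofline_time(layer: Layer, peak_ops_per_s: float,
                  bandwidth_bytes_per_s: float) -> float:
    """Roofline bound: the slower of the compute and memory limits, in seconds."""
    compute_s = layer.ops / peak_ops_per_s
    memory_s = layer.bytes_moved / bandwidth_bytes_per_s
    return max(compute_s, memory_s)


def network_time(layers: Iterable[Layer], peak_ops_per_s: float,
                 bandwidth_bytes_per_s: float) -> float:
    """Sum of per-layer roofline times (assumes strictly sequential layers)."""
    return sum(roofline_time(layer, peak_ops_per_s, bandwidth_bytes_per_s)
               for layer in layers)


# Hypothetical accelerator: 1 TOP/s peak compute, 10 GB/s memory bandwidth.
PEAK_OPS_PER_S = 1e12
BANDWIDTH_BYTES_PER_S = 1e10

# Hypothetical two-layer network.
layers = [
    Layer("conv", ops=2e9, bytes_moved=1e6),  # compute-bound: 2e-3 s vs 1e-4 s
    Layer("fc", ops=2e6, bytes_moved=4e6),    # memory-bound: 2e-6 s vs 4e-4 s
]

total_s = network_time(layers, PEAK_OPS_PER_S, BANDWIDTH_BYTES_PER_S)
```

Because the roofline bound assumes the hardware always reaches peak compute or peak bandwidth, it tends to be optimistic for real accelerators, which is the kind of gap that benchmark-derived models aim to close.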

Code & Models

Repositories

embedded-machine-learning/annette (PyTorch, official)
