MAPLE-Edge: A Runtime Latency Predictor for Edge Devices

Saeejith Nair; Saad Abbasi; Alexander Wong; Mohammad Javad Shafiee

arXiv:2204.12950·cs.LG·April 28, 2022

TL;DR

MAPLE-Edge is a latency prediction model tailored for edge devices that uses minimal runtime data to accurately estimate neural network latency, facilitating efficient NAS for optimized edge deployment.

Contribution

MAPLE-Edge extends MAPLE to edge devices by describing the hardware and runtime with a small set of CPU counters and normalization techniques, achieving significant accuracy improvements over prior methods.

Findings

01. Up to +49.6% accuracy gain over baseline methods.

02. Effective generalization across diverse runtimes.

03. Additional samples improve accuracy by nearly +40%.

Abstract

Neural Architecture Search (NAS) has enabled automatic discovery of more efficient neural network architectures, especially for mobile and embedded vision applications. Although recent research has proposed ways of quickly estimating latency on unseen hardware devices with just a few samples, little focus has been given to the challenges of estimating latency on runtimes using optimized graphs, such as TensorRT, and specifically for edge devices. In this work, we propose MAPLE-Edge, an edge device-oriented extension of MAPLE, the state-of-the-art latency predictor for general-purpose hardware, where we train a regression network on architecture-latency pairs in conjunction with a hardware-runtime descriptor to effectively estimate latency on a diverse pool of edge devices. Compared to MAPLE, MAPLE-Edge can describe the runtime and target device platform using a much smaller set of CPU…
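
The abstract describes training a regression network on architecture-latency pairs together with a hardware-runtime descriptor. As a rough illustration of that input layout only, the sketch below concatenates an architecture encoding with a device descriptor and fits a latency regressor on the result. It is not the authors' implementation: a plain linear model fitted by gradient descent stands in for the paper's regression network, and every device name, descriptor value, operator cost, and latency here is invented for the example.

```python
from itertools import product

# Hypothetical hardware-runtime descriptors (stand-ins for normalized counter
# readings). Device names and values are invented for illustration only.
DESCRIPTORS = {"device_a": [0.2, 0.8], "device_b": [0.9, 0.3]}

# Synthetic ground truth, used only to generate toy latencies in milliseconds.
OP_COST_MS = [1.0, 2.0, 0.5, 1.5]
DEVICE_OFFSET_MS = {"device_a": 0.5, "device_b": 2.0}


def build_input(arch_encoding, hw_descriptor):
    """Concatenate an architecture encoding with a hardware-runtime descriptor."""
    return list(arch_encoding) + list(hw_descriptor)


def predict(weights, bias, features):
    """Linear latency estimate for one feature vector."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mean_squared_error(weights, bias, xs, ys):
    return sum((predict(weights, bias, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit weights and bias by full-batch gradient descent on mean squared error."""
    n, d = len(xs), len(xs[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(xs, ys):
            err = predict(weights, bias, x) - y
            for j in range(d):
                grad_w[j] += 2.0 * err * x[j] / n
            grad_b += 2.0 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Toy dataset: every 4-slot binary architecture "measured" on both devices.
xs, ys = [], []
for device, descriptor in DESCRIPTORS.items():
    for arch in product([0, 1], repeat=4):
        xs.append(build_input(arch, descriptor))
        ys.append(sum(c * a for c, a in zip(OP_COST_MS, arch)) + DEVICE_OFFSET_MS[device])

initial_mse = mean_squared_error([0.0] * len(xs[0]), 0.0, xs, ys)
weights, bias = fit_linear(xs, ys)
final_mse = mean_squared_error(weights, bias, xs, ys)
print(f"MSE before fitting: {initial_mse:.4f}, after fitting: {final_mse:.4f}")
```

The toy latencies are additive in the architecture and the device, so a linear model can represent them. Real latencies on optimized runtimes generally are not additive in this way, which is one reason a learned regression network is the more plausible choice in practice.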

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning in Materials Science · Parallel Computing and Optimization Techniques