LETI: Latency Estimation Tool and Investigation of Neural Networks inference on Mobile GPU
Evgeny Ponomarev, Sergey Matveev, Ivan Oseledets

TL;DR
This paper introduces LETI, a tool for accurate latency estimation of neural network inference on mobile GPUs, addressing the limitations of FLOPs-based proxies and lookup tables, enabling better neural architecture search and performance analysis.
Contribution
The authors develop an open-source latency estimation tool specifically for mobile GPUs, with a data-driven regression model for precise latency prediction tailored to individual devices.
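As a rough illustration of what a device-specific, data-driven latency regressor could look like, the sketch below fits an ordinary least-squares model to a handful of measurements. The feature choice (MFLOPs and layer count), the synthetic numbers, and the function names `fit_least_squares` and `predict` are assumptions made for illustration only; they are not LETI's API or the authors' actual model.

```python
def fit_least_squares(features, targets):
    """Fit weights (intercept first) by solving the normal equations X^T X w = X^T y."""
    rows = [[1.0] + list(f) for f in features]  # prepend an intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def predict(weights, feature):
    """Predict latency for one architecture from its feature vector."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature))


# Synthetic (made-up) measurements for ONE hypothetical device:
# features are (MFLOPs, number of layers), targets are latencies in ms.
train_features = [(100, 4), (200, 6), (300, 5), (400, 10)]
train_latency_ms = [5.0, 7.0, 7.5, 11.0]

weights = fit_least_squares(train_features, train_latency_ms)
estimate_ms = predict(weights, (250, 8))
```

Because the model is fitted to measurements from a single device, a separate set of weights would be needed for each target device, which matches the per-device framing in the summary above.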
Findings
Latency prediction models achieve good accuracy on target devices.
The approach is validated on NAS-Benchmark 101 and popular neural architectures.
LETI facilitates neural architecture search and large-scale latency evaluation.
Abstract
Many deep learning applications are intended to run on mobile devices, and for many of them both accuracy and inference time matter. The number of FLOPs is commonly used as a proxy for neural network latency, but it may not be the best choice. To obtain a better approximation of latency, the research community uses look-up tables of all possible layers to predict inference latency on mobile CPUs. This requires only a small number of experiments. Unfortunately, on mobile GPUs this method cannot be applied in a straightforward way and shows low precision. In this work, we consider latency approximation on mobile GPU as a data- and hardware-specific problem. Our main goal is to construct a convenient latency estimation tool for investigation (LETI) of neural network inference and building robust and accurate latency prediction models for…
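The look-up-table baseline described in the abstract can be sketched as a sum of per-layer latencies measured in isolation. The layer names, the table values, and the helper `lookup_latency` below are hypothetical placeholders, not measurements or code from the paper.

```python
def lookup_latency(layers, table):
    """Estimate network latency as the sum of per-layer latencies from a table."""
    return sum(table[layer] for layer in layers)


# Hypothetical per-layer latencies in ms, each measured once in isolation.
layer_table_ms = {
    "conv3x3": 1.5,
    "relu": 0.25,
    "maxpool": 0.5,
}

# A toy network described as a sequence of layer types.
network = ["conv3x3", "relu", "conv3x3", "maxpool"]
estimate_ms = lookup_latency(network, layer_table_ms)
```

The estimate is purely additive: it assumes each layer costs the same inside a network as it does alone. The abstract reports that this approach shows low precision on mobile GPUs, which is the motivation for treating latency prediction as a data- and hardware-specific problem.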
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Human Pose and Action Recognition
