Deep Learning Inference on Heterogeneous Mobile Processors: Potentials and Pitfalls
Sicong Liu, Wentao Zhou, Zimu Zhou, Bin Guo, Minfan Wang, Cheng Fang, Zheng Lin, Zhiwen Yu

TL;DR
This paper empirically evaluates the potential and challenges of deploying deep learning inference on heterogeneous mobile processors, highlighting limitations of current methods and opportunities for optimization in real-world scenarios.
Contribution
It provides a comprehensive empirical analysis of deep learning inference on heterogeneous mobile hardware, revealing practical limitations and suggesting avenues for cross-level optimization.
Findings
Existing techniques face limitations in dynamic mobile environments.
Parallel execution across heterogeneous processors can accelerate inference, but balancing load among them remains a challenge.
Opportunities exist for cross-level optimization to improve performance.
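To make the load-balancing finding concrete, here is a minimal sketch that is not taken from the paper. It assumes each processor has a fixed, known throughput (the `cpu`, `gpu`, and `npu` figures are hypothetical) and that the work divides into equal units. The helper names `split_workload` and `makespan` are illustrative only. The finding that existing techniques struggle in dynamic environments suggests that a fixed-throughput assumption like this one may not hold on a real device.

```python
def split_workload(total_units, throughputs):
    """Split work across processors in proportion to throughput.

    throughputs maps a processor name to units processed per millisecond
    (hypothetical values, assumed constant for this sketch).
    """
    total_tp = sum(throughputs.values())
    shares = {name: int(total_units * tp / total_tp)
              for name, tp in throughputs.items()}
    # Rounding down can leave a few units unassigned;
    # hand them to the fastest processors first.
    leftover = total_units - sum(shares.values())
    fastest_first = sorted(throughputs, key=throughputs.get, reverse=True)
    for name in fastest_first[:leftover]:
        shares[name] += 1
    return shares


def makespan(shares, throughputs, sync_cost=0.0):
    """Parallel latency: the slowest processor's time plus a fixed sync cost."""
    return max(shares[n] / throughputs[n] for n in shares) + sync_cost


# Hypothetical throughputs, in work units per millisecond.
throughputs = {"cpu": 1.0, "gpu": 3.0, "npu": 4.0}

balanced = split_workload(64, throughputs)
even = {"cpu": 22, "gpu": 21, "npu": 21}

print(balanced, makespan(balanced, throughputs))
print(even, makespan(even, throughputs))
```

In this toy setting the proportional split finishes on all three processors at the same time, while the even split is held back by the slowest one. The `sync_cost` parameter stands in for the communication cost that the abstract mentions.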
Abstract
There is a growing demand to deploy computation-intensive deep learning (DL) models on resource-constrained mobile devices for real-time intelligent applications. Equipped with a variety of processing units such as CPUs, GPUs, and NPUs, mobile devices have the potential to accelerate DL inference via parallel execution across heterogeneous processors. Various efficient parallel methods have been explored to optimize computation distribution, achieve load balance, and minimize communication cost across processors. Yet their practical effectiveness in dynamic and diverse real-world mobile environments is less explored. This paper presents a holistic empirical study to assess the capabilities and challenges associated with parallel DL inference on heterogeneous mobile processors. Through carefully designed experiments covering various DL models, mobile software/hardware environments,…
Taxonomy
Topics: Parallel Computing and Optimization Techniques
