Cloud-based or On-device: An Empirical Study of Mobile Deep Inference

Tian Guo

arXiv:1707.04610 · cs.PF · March 1, 2019 · 2 cites

TL;DR

This paper empirically compares cloud-based and on-device deep learning inference on mobile devices, revealing significant performance and energy trade-offs, and identifying key bottlenecks in on-device inference.

Contribution

It provides an empirical evaluation of mobile deep inference performance, highlighting the feasibility challenges of on-device inference compared to cloud-based solutions.

Findings

01. On-device inference can be up to 100 times slower than cloud-based inference.

02. Loading models and computing probabilities are major bottlenecks.

03. On-device inference consumes significantly more energy than cloud-based inference.

Abstract

Modern mobile applications are benefiting significantly from advances in deep learning, e.g., by implementing real-time image recognition and conversational systems. Given a trained deep learning model, applications usually need to perform a series of matrix operations on the input data in order to infer possible output values. Because of their computational complexity and size, these trained models are often hosted in the cloud. To use these cloud-based models, mobile apps have to send input data over the network. While cloud-based deep learning can provide reasonable response times for mobile apps, it restricts the use cases, e.g., mobile apps need to have network access. With mobile-specific deep learning optimizations, it is now possible to employ on-device inference. However, because mobile hardware, such as GPU and memory size, can be very limited…
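To make the abstract's "series of matrix operations" concrete, here is a minimal sketch of inference in plain Python. It is an illustration only, not the paper's models or framework: the two-layer network, the names `infer` and `TOY_LAYERS`, and every weight value are assumptions made up for this example. It shows the two steps the abstract describes, chaining matrix operations over the input and turning the final scores into output probabilities.

```python
import math


def matvec(weights, vec):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def relu(vec):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in vec]


def softmax(vec):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(vec)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in vec]
    total = sum(exps)
    return [e / total for e in exps]


def infer(layers, x):
    """Run a forward pass through a list of (weights, biases) layers.

    ReLU is applied between layers; softmax is applied to the last layer.
    """
    for i, (weights, biases) in enumerate(layers):
        x = [v + b for v, b in zip(matvec(weights, x), biases)]
        if i < len(layers) - 1:
            x = relu(x)
    return softmax(x)


# Hypothetical toy model: 3 inputs -> 2 hidden units -> 3 output classes.
TOY_LAYERS = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [-0.5, 0.5], [0.2, 0.2]], [0.0, 0.0, 0.0]),
]

probs = infer(TOY_LAYERS, [1.0, 2.0, 3.0])
predicted_class = probs.index(max(probs))
```

Real image-recognition models apply the same pattern with millions of weights, which is why the model's size and the cost of each forward pass matter when the work moves from a cloud server to a phone.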

Taxonomy

Topics: IoT and Edge/Fog Computing · Green IT and Sustainability · Age of Information Optimization