Guidelines and Benchmarks for Deployment of Deep Learning Models on Smartphones as Real-Time Apps

Abhishek Sehgal; Nasser Kehtarnavaz

arXiv:1901.02144 · cs.LG · January 9, 2019 · 5 cites

TL;DR

This paper provides unified guidelines and benchmarks for deploying deep learning models as real-time apps on Android and iOS smartphones, using multi-threading to improve throughput and a benchmarking framework for validation.

Contribution

It introduces a comprehensive deployment framework and benchmarking methodology for real-time deep learning inference on smartphones, applicable to multiple neural network models.

Findings

01

Multi-threading improves real-time throughput.

02

Benchmarking covers accuracy, CPU/GPU consumption, and real-time throughput.

03

The deployment approach allows deep learning models to be turned into smartphone apps.

Abstract

Deep learning solutions are increasingly used in mobile applications. Although many open-source software tools exist for developing deep learning solutions, no single, unified set of guidelines covers how to use these tools for real-time deployment of such solutions on smartphones. From the variety of available deep learning tools, the most suitable ones are used in this paper to enable real-time deployment of deep learning inference networks on smartphones. A uniform implementation flow is devised for both Android and iOS smartphones. The advantage of using multi-threading to achieve or improve real-time throughput is also showcased. A benchmarking framework consisting of accuracy, CPU/GPU consumption, and real-time throughput is considered for validation purposes. The developed deployment approach allows deep learning models to be turned into…

Code & Models

Repositories

SIP-Lab/Deep-Learning-Mobile
tf · Official

Taxonomy

Topics: Advanced Neural Network Applications · IoT and Edge/Fog Computing · Context-Aware Activity Recognition Systems

Methods: 1x1 Convolution · Average Pooling · Batch Normalization · Residual Connection · Max Pooling · Global Average Pooling · Bottleneck Residual Block · Residual Block · Kaiming Initialization