Guidelines and Benchmarks for Deployment of Deep Learning Models on Smartphones as Real-Time Apps
Abhishek Sehgal, Nasser Kehtarnavaz

TL;DR
This paper provides unified guidelines and benchmarks for deploying deep learning models as real-time smartphone apps on Android and iOS. It shows the benefit of multi-threading and validates deployments using accuracy, CPU/GPU consumption, and real-time throughput.
Contribution
It introduces a comprehensive deployment framework and benchmarking methodology for real-time deep learning inference on smartphones, applicable to multiple neural network models.
Findings
Multi-threading helps achieve or improve real-time throughput.
Benchmarking covers accuracy, CPU/GPU consumption, and real-time throughput.
Deployment framework simplifies turning models into smartphone apps.
Abstract
Deep learning solutions are increasingly used in mobile applications. Although many open-source software tools exist for developing deep learning solutions, no single, unified set of guidelines covers how to use these tools for real-time deployment on smartphones. From the variety of available deep learning tools, the most suitable ones are used in this paper to enable real-time deployment of deep learning inference networks on smartphones. A uniform implementation flow is devised for both Android and iOS smartphones. The advantage of using multi-threading to achieve or improve real-time throughput is also showcased. A benchmarking framework consisting of accuracy, CPU/GPU consumption, and real-time throughput is used for validation. The developed deployment approach allows deep learning models to be turned into…
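The multi-threading and benchmarking ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `run_pipeline`, `accuracy`, and `toy_model` are hypothetical, the real-time test (mean inference time no longer than the frame period) is an assumed criterion, and CPU/GPU consumption is not measured here. Python threads are used only to show the producer/consumer structure; an Android or iOS app would use the platform's own threading facilities.

```python
import queue
import threading
import time


def run_pipeline(frames, infer, frame_period_s):
    """Run `infer` on a worker thread while the calling thread keeps feeding frames.

    Returns (predictions, mean_latency_s, real_time_ok), where real_time_ok uses
    the assumed criterion mean latency <= frame period.
    """
    pending = queue.Queue()
    predictions = []
    latencies = []

    def worker():
        while True:
            frame = pending.get()
            if frame is None:  # sentinel: no more frames will arrive
                break
            start = time.perf_counter()
            predictions.append(infer(frame))
            latencies.append(time.perf_counter() - start)

    thread = threading.Thread(target=worker)
    thread.start()
    for frame in frames:
        pending.put(frame)  # the feeding thread does not wait for inference
    pending.put(None)
    thread.join()

    mean_latency = sum(latencies) / len(latencies) if latencies else 0.0
    return predictions, mean_latency, mean_latency <= frame_period_s


def accuracy(predictions, labels):
    """Fraction of predictions that match the reference labels."""
    return sum(p == l for p, l in zip(predictions, labels)) / len(labels)


def toy_model(frame):
    """Stand-in for a real inference call (hypothetical)."""
    return sum(frame) > 0


preds, mean_latency, ok = run_pipeline([[1, -2], [3, 4]], toy_model, frame_period_s=0.05)
print(preds, ok, accuracy(preds, [False, True]))
```

A single worker with a FIFO queue keeps predictions in frame order, so they can be compared directly against labels when computing accuracy.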
Taxonomy
Topics: Advanced Neural Network Applications · IoT and Edge/Fog Computing · Context-Aware Activity Recognition Systems
Methods: 1x1 Convolution · Average Pooling · Batch Normalization · Residual Connection · Max Pooling · Global Average Pooling · Bottleneck Residual Block · Residual Block · Kaiming Initialization
