PhoneBit: Efficient GPU-Accelerated Binary Neural Network Inference Engine for Mobile Phones

Gang Chen; Shengyu He; Haitao Meng; Kai Huang

arXiv:1912.04050 · cs.DC · December 10, 2019 · 1 citation

TL;DR

PhoneBit is a GPU-accelerated inference engine designed specifically for mobile phones, enabling efficient deployment of binary neural networks by optimizing data layout, bit packing, and layer integration to maximize performance and energy efficiency.

Contribution

This paper introduces PhoneBit, a novel GPU-accelerated BNN inference engine for mobile devices that employs operator-level optimizations tailored for mobile GPU architectures.

Findings

01

Achieves significant speedup over existing frameworks.

02

Demonstrates improved energy efficiency on mobile GPUs.

03

Effectively supports binary versions of AlexNet, YOLOv2 Tiny, and VGG16.

Abstract

In recent years, deep neural networks (DNNs) have achieved great success in computer vision and other fields. However, their high computational complexity, combined with the performance and power constraints of mobile devices, still makes them challenging to deploy on such devices. Binary neural networks (BNNs) have been demonstrated as a promising solution, replacing most arithmetic operations with bit-wise operations. Existing GPU-accelerated implementations of BNNs are tailored only for desktop platforms. Because of architectural differences, merely porting such implementations to mobile devices yields suboptimal performance or is impossible in some cases. In this paper, we propose PhoneBit, a GPU-accelerated BNN inference engine for Android-based mobile devices that fully exploits the computing power of BNNs on mobile GPUs. PhoneBit provides a set of…
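
To illustrate how bit-wise operations can stand in for arithmetic in a BNN, the sketch below packs vectors of +1/-1 values into integers and computes their dot product with XNOR and a population count. This is a generic, CPU-side illustration of the commonly used XNOR-popcount identity, not PhoneBit's GPU kernels; the function names `pack_bits` and `binary_dot` and the encoding (+1 as bit 1, -1 as bit 0) are assumptions made for the example.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into one int (+1 -> bit 1, -1 -> bit 0).

    Hypothetical helper for illustration; element i is stored in bit i.
    """
    word = 0
    for i, v in enumerate(values):
        if v not in (1, -1):
            raise ValueError("values must be +1 or -1")
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks the positions where the two vectors agree. Each agreement
    contributes +1 and each disagreement -1, so the result is
    matches - (n - matches) = 2 * matches - n.
    """
    mask = (1 << n) - 1                 # keep only the n valid bits
    xnor = ~(a_bits ^ b_bits) & mask    # 1 where the bits are equal
    matches = bin(xnor).count("1")      # population count
    return 2 * matches - n


# Example: compare against the ordinary arithmetic dot product.
a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]
reference = sum(x * y for x, y in zip(a, b))
packed = binary_dot(pack_bits(a), pack_bits(b), len(a))
assert packed == reference
```

Packing many values into one machine word is what lets a single XNOR and popcount replace a whole run of multiply-accumulate operations; a real engine would use fixed-width words and hardware popcount rather than Python integers.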

Taxonomy

Topics: Advanced Neural Network Applications · IoT and Edge/Fog Computing · Context-Aware Activity Recognition Systems

Methods: Average Pooling · Global Average Pooling · 1x1 Convolution · Batch Normalization · Convolution · Darknet-19 · Local Response Normalization · Grouped Convolution