ILP-M Conv: Optimize Convolution Algorithm for Single-Image Convolution Neural Network Inference on Mobile GPUs

Zhuoran Ji

arXiv:1909.02765·cs.DC·October 4, 2019·1 cites

TL;DR

This paper introduces a convolution algorithm for single-image CNN inference on mobile GPUs (named ILP-M Conv in the title and HNTMP in the abstract), reporting a 14.6x speedup over im2col convolution and a 2.30x speedup over direct convolution.

Contribution

The paper proposes the HNTMP convolution algorithm, designed specifically for single-image inference on mobile GPUs, which runs faster than both the im2col and direct convolution algorithms.

Findings

1. 14.6x speedup over im2col convolution
2. 2.30x speedup over existing direct convolution
3. Effective optimization for mobile GPU inference

Abstract

Convolution neural networks are widely used for mobile applications. However, GPU convolution algorithms are designed for mini-batch neural network training, and the single-image convolution neural network inference algorithm on mobile GPUs is not well-studied. After discussing the usage difference and examining the existing convolution algorithms, we propose the HNTMP convolution algorithm. The HNTMP convolution algorithm achieves a $14.6 \times$ speedup over the most popular im2col convolution algorithm, and a $2.30 \times$ speedup over direct convolution, which is, as far as we know, the fastest existing convolution algorithm.
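For readers unfamiliar with the two baselines named in the abstract, the following is a minimal CPU-only sketch of direct convolution and im2col convolution for a single image (stride 1, no padding). It is not the paper's HNTMP algorithm and says nothing about GPU performance; the function names `direct_conv`, `im2col`, and `im2col_conv` and the nested-list layouts are assumptions made for illustration.

```python
def direct_conv(image, kernels):
    """Direct convolution. image[c][y][x], kernels[k][c][r][s] -> out[k][y][x]."""
    C, H, W = len(image), len(image[0]), len(image[0][0])
    K, R, S = len(kernels), len(kernels[0][0]), len(kernels[0][0][0])
    out_h, out_w = H - R + 1, W - S + 1
    out = [[[0.0] * out_w for _ in range(out_h)] for _ in range(K)]
    for k in range(K):
        for y in range(out_h):
            for x in range(out_w):
                acc = 0.0
                # Accumulate over every input channel and filter tap.
                for c in range(C):
                    for r in range(R):
                        for s in range(S):
                            acc += image[c][y + r][x + s] * kernels[k][c][r][s]
                out[k][y][x] = acc
    return out


def im2col(image, R, S):
    """Unfold each receptive field into a flat patch of length C*R*S."""
    C, H, W = len(image), len(image[0]), len(image[0][0])
    out_h, out_w = H - R + 1, W - S + 1
    patches = []
    for y in range(out_h):
        for x in range(out_w):
            patches.append([image[c][y + r][x + s]
                            for c in range(C)
                            for r in range(R)
                            for s in range(S)])
    return patches


def im2col_conv(image, kernels):
    """im2col convolution: unfold the image, then do a matrix product."""
    C, H, W = len(image), len(image[0]), len(image[0][0])
    K, R, S = len(kernels), len(kernels[0][0]), len(kernels[0][0][0])
    out_h, out_w = H - R + 1, W - S + 1
    patches = im2col(image, R, S)
    # Flatten each filter in the same (c, r, s) order used by im2col.
    flat_kernels = [[kernels[k][c][r][s]
                     for c in range(C)
                     for r in range(R)
                     for s in range(S)]
                    for k in range(K)]
    out = []
    for weights in flat_kernels:
        plane = [sum(w * p for w, p in zip(weights, patch)) for patch in patches]
        # Reshape the flat output row back into out_h rows of out_w values.
        out.append([plane[y * out_w:(y + 1) * out_w] for y in range(out_h)])
    return out


image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]   # 1 channel, 3x3
kernels = [[[[1, 0], [0, 1]]]]                # 1 filter, 1 channel, 2x2
assert direct_conv(image, kernels) == im2col_conv(image, kernels)
```

In this toy case the unfolded matrix holds 4 patches of 4 values each (16 elements) for a 9-element input, which illustrates the extra memory that im2col spends on duplicated input values before the matrix product.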

Code & Models

Repositories

jizhuoran/sj_convolution (Official)

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Image Enhancement Techniques

Methods: Convolution