ILP-M Conv: Optimize Convolution Algorithm for Single-Image Convolution Neural Network Inference on Mobile GPUs
Zhuoran Ji

TL;DR
This paper introduces the ILP-M Conv algorithm (called HNTMP in the abstract), a convolution algorithm optimized for single-image CNN inference on mobile GPUs. It achieves a 14.6x speedup over im2col convolution and a 2.30x speedup over existing direct convolution.
Contribution
The paper proposes the HNTMP convolution algorithm, specifically designed for single-image inference on mobile GPUs, outperforming traditional algorithms in speed.
Findings
14.6x speedup over im2col convolution
2.30x speedup over existing direct convolution
Effective optimization for mobile GPU inference
Abstract
Convolutional neural networks are widely used in mobile applications. However, GPU convolution algorithms are designed for mini-batch neural network training, and single-image convolutional neural network inference on mobile GPUs is not well studied. After discussing the differences in usage and examining existing convolution algorithms, we propose the HNTMP convolution algorithm. It achieves a 14.6x speedup over im2col, the most popular convolution algorithm, and a 2.30x speedup over direct convolution, the fastest existing convolution algorithm as far as we know.
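For readers unfamiliar with the two baselines named in the abstract, the following is a minimal pure-Python sketch of direct convolution and im2col convolution for a single image. It is not the HNTMP (ILP-M) algorithm, which this page does not describe, and it is not GPU code. The function names, the channel-first layout, stride 1, and the absence of padding are all assumptions made for illustration.

```python
# Illustrative baselines only; names and layout are assumptions, not the paper's code.
# image:   [C][H][W]      (one image, channel-first)
# kernels: [K][C][R][S]   (K filters), stride 1, no padding


def _shapes(image, kernels):
    C, H, W = len(image), len(image[0]), len(image[0][0])
    K, R, S = len(kernels), len(kernels[0][0]), len(kernels[0][0][0])
    return C, K, R, S, H - R + 1, W - S + 1


def direct_conv(image, kernels):
    """Direct convolution: accumulate each output element in place."""
    C, K, R, S, out_h, out_w = _shapes(image, kernels)
    out = [[[0.0] * out_w for _ in range(out_h)] for _ in range(K)]
    for k in range(K):
        for y in range(out_h):
            for x in range(out_w):
                acc = 0.0
                for c in range(C):
                    for r in range(R):
                        for s in range(S):
                            acc += image[c][y + r][x + s] * kernels[k][c][r][s]
                out[k][y][x] = acc
    return out


def im2col(image, R, S):
    """Unroll receptive fields into a (C*R*S) x (out_h*out_w) matrix."""
    C, H, W = len(image), len(image[0]), len(image[0][0])
    out_h, out_w = H - R + 1, W - S + 1
    return [
        [image[c][y + r][x + s] for y in range(out_h) for x in range(out_w)]
        for c in range(C) for r in range(R) for s in range(S)
    ]


def im2col_conv(image, kernels):
    """im2col convolution: unroll the input, then do one matrix multiply."""
    C, K, R, S, out_h, out_w = _shapes(image, kernels)
    cols = im2col(image, R, S)
    # Flatten each filter in the same (c, r, s) order used for the im2col rows.
    flat = [
        [kernels[k][c][r][s] for c in range(C) for r in range(R) for s in range(S)]
        for k in range(K)
    ]
    n = out_h * out_w
    prod = [
        [sum(flat[k][i] * cols[i][j] for i in range(len(cols))) for j in range(n)]
        for k in range(K)
    ]
    # Reshape each flat output row back to out_h x out_w.
    return [[row[y * out_w:(y + 1) * out_w] for y in range(out_h)] for row in prod]
```

Both functions compute the same result. The difference is that im2col first materializes an unrolled copy of the input that repeats each pixel up to R*S times, whereas direct convolution reads the input in place.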
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Image Enhancement Techniques
Methods: Convolution
