I/O Lower Bounds for Auto-tuning of Convolutions in CNNs

Xiaoyang Zhang; Junmin Xiao; Guangming Tan

arXiv:2012.15667 · cs.LG · January 1, 2021 · 1 citation

Open Access

TL;DR

This paper develops I/O lower bounds for CNN convolution algorithms, designs near-optimal dataflow strategies, and employs auto-tuning to significantly improve GPU performance over existing methods like cuDNN and TVM.

Contribution

It introduces a comprehensive I/O lower bound theory for CNN convolutions, and applies it to optimize dataflow and auto-tuning strategies for direct and Winograd algorithms on GPUs.
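For readers unfamiliar with the two algorithms named above, the following is a minimal sketch of how they differ, using the textbook one-dimensional Winograd F(2,3) case. It is an illustration only and is not the paper's GPU dataflow or its lower-bound analysis. The function names `direct_conv1d` and `winograd_f23` are hypothetical, chosen for this example.

```python
def direct_conv1d(d, g):
    """Direct (sliding-window) convolution as used in CNNs (no filter flip).

    Each output needs len(g) multiplications.
    """
    r = len(g)
    return [sum(d[i + k] * g[k] for k in range(r))
            for i in range(len(d) - r + 1)]


def winograd_f23(d, g):
    """Winograd F(2,3): 2 outputs from a 4-element tile and a 3-tap filter.

    Uses 4 elementwise multiplications instead of the 6 needed by the
    direct method, at the cost of extra additions in the transforms.
    """
    assert len(d) == 4 and len(g) == 3

    # Filter transform (can be precomputed once per filter).
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform followed by the 4 multiplications.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform.
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    tile = [1.0, 2.0, 3.0, 4.0]
    taps = [1.0, 0.0, -1.0]
    print(direct_conv1d(tile, taps))
    print(winograd_f23(tile, taps))
```

Both functions return the same two outputs for the same tile and filter. The trade-off the sketch makes visible is that Winograd reduces multiplications but introduces transformed intermediate data, which is one reason the two algorithms have different data movement behavior.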

Findings

01 Achieves 3.32x speedup over cuDNN

02 Faster auto-tuning than TVM

03 Higher performance than TVM's optimal configurations

Abstract

Convolution is the most time-consuming part of the computation in convolutional neural networks (CNNs), which have achieved great success in numerous applications. Because of complex data dependencies and the growing number of model samples, convolution suffers from high data movement (i.e., memory access) overhead. This work provides comprehensive analysis and methodologies for minimizing communication in CNN convolutions. Building on an in-depth analysis of recent I/O complexity theory under the red-blue game model, we develop a general I/O lower bound theory for composite algorithms that consist of several different sub-computations. Based on the proposed theory, we establish data movement lower bounds for two representative convolution algorithms in CNNs: direct convolution and the Winograd algorithm. Next, derived from I/O lower bound…

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques

Methods: Convolution