Communication Bounds for Convolutional Neural Networks

Anthony Chen; James Demmel; Grace Dinh; Mason Haberle; Olga Holtz

arXiv:2204.08279·cs.DC·July 14, 2022

TL;DR

This paper proves new lower bounds on data movement for mixed precision CNN convolutions and presents tiling algorithms that, on the GEMMINI machine learning accelerator, improve performance by 13% to 150% over a vendor-supplied algorithm.

Contribution

The paper gives new lower bounds on data movement for mixed precision convolutions, in both single-processor and parallel distributed memory models, and develops algorithms that outperform current implementations such as Im2Col.

Findings

01

Performance improvements of 13% to 150% over a vendor-supplied algorithm on the GEMMINI accelerator

02

New lower bounds on data movement for mixed precision convolutions

03

Lower bounds for both single-processor and parallel distributed memory models, with algorithms that outperform current implementations such as Im2Col

Abstract

Convolutional neural networks (CNNs) are important in a wide variety of machine learning tasks and applications, so optimizing their performance is essential. Moving words of data between levels of a memory hierarchy or between processors on a network is much more expensive than arithmetic, so minimizing communication is critical to optimizing performance. In this paper, we present new lower bounds on data movement for mixed precision convolutions in both single-processor and parallel distributed memory models, as well as algorithms that outperform current implementations such as Im2Col. We obtain performance figures using GEMMINI, a machine learning accelerator, where our tiling provides improvements between 13% and 150% over a vendor-supplied algorithm.
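For readers unfamiliar with Im2Col, the following is a minimal sketch of the general technique in plain Python, not the paper's algorithm or any vendor implementation. It assumes a single-channel image, stride 1, no padding, and the cross-correlation convention common in CNNs; the function names and the toy 6x6 input are illustrative choices. It shows why data movement matters here: unrolling patches into a matrix duplicates input words, so the unrolled operand is larger than the image it came from.

```python
def im2col(image, kh, kw):
    """Unroll every kh-by-kw patch of a 2D image into one row of a patch matrix."""
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [image[i + di][j + dj] for di in range(kh) for dj in range(kw)]
        for i in range(out_h)
        for j in range(out_w)
    ]


def conv2d_direct(image, kernel):
    """Direct sliding-window convolution (cross-correlation, stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def conv2d_im2col(image, kernel):
    """Same convolution, computed as a patch-matrix times flattened-kernel product."""
    kh, kw = len(kernel), len(kernel[0])
    out_w = len(image[0]) - kw + 1
    flat_kernel = [v for row in kernel for v in row]
    flat_out = [sum(p * k for p, k in zip(patch, flat_kernel))
                for patch in im2col(image, kh, kw)]
    # Reshape the flat result back into out_h rows of out_w values.
    return [flat_out[r:r + out_w] for r in range(0, len(flat_out), out_w)]


# Toy input (hypothetical): a 6x6 ramp image and a 3x3 horizontal-difference kernel.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]

direct = conv2d_direct(image, kernel)
unrolled = conv2d_im2col(image, kernel)

input_words = len(image) * len(image[0])      # words in the original image
patch_words = len(im2col(image, 3, 3)) * 9    # words in the unrolled patch matrix
print(direct == unrolled, input_words, patch_words)
```

In this toy case the two methods agree, while the patch matrix holds 144 words against 36 in the original image. That fourfold duplication is the kind of extra data movement that communication-aware tiling aims to avoid.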
