maxDNN: An Efficient Convolution Kernel for Deep Learning with Maxwell GPUs

Andrew Lavin

arXiv:1501.06633·cs.NE·February 3, 2015·32 cites

TL;DR

maxDNN is a convolution kernel for NVIDIA Maxwell GPUs that reaches 96.3% computational efficiency on typical deep learning network architectures by combining ideas from cuda-convnet2 with the Maxas SGEMM assembly code.

Contribution

The paper introduces maxDNN, a convolution kernel for forward propagation on Maxwell GPUs whose design combines ideas from cuda-convnet2 with the assembly-level techniques of the Maxas SGEMM code.

Findings

01

Reaches 96.3% computational efficiency on typical deep learning network architectures

02

Combines ideas from cuda-convnet2 and Maxas SGEMM assembly

03

Addresses only forward propagation (FPROP); the author expects the same techniques to be effective for backward propagation (BPROP)

Abstract

This paper describes maxDNN, a computationally efficient convolution kernel for deep learning with the NVIDIA Maxwell GPU. maxDNN reaches 96.3% computational efficiency on typical deep learning network architectures. The design combines ideas from cuda-convnet2 with the Maxas SGEMM assembly code. We address only the forward propagation (FPROP) operation of the network, but we believe that the same techniques used here will be effective for backward propagation (BPROP) as well.
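To make the terms in the abstract concrete, the following is a minimal Python sketch of what a forward-propagation convolution computes and how a computational-efficiency figure could be derived from it. It is not the maxDNN kernel, which is written in Maxwell assembly. The function names (`conv_fprop`, `conv_fprop_flops`, `computational_efficiency`), the stride-1, no-padding layout, and the definition of efficiency as achieved throughput divided by device peak throughput are all assumptions made for illustration.

```python
def conv_fprop(image, filters):
    """Naive direct convolution for one image (stride 1, no padding).

    image:   nested lists shaped [C][H][W]
    filters: nested lists shaped [K][C][R][S]
    returns: nested lists shaped [K][P][Q], with P = H - R + 1, Q = W - S + 1

    As in most deep learning code, the filter is not flipped, so this is
    technically a cross-correlation.
    """
    C, H, W = len(image), len(image[0]), len(image[0][0])
    K, R, S = len(filters), len(filters[0][0]), len(filters[0][0][0])
    P, Q = H - R + 1, W - S + 1
    out = [[[0.0] * Q for _ in range(P)] for _ in range(K)]
    for k in range(K):
        for p in range(P):
            for q in range(Q):
                acc = 0.0
                for c in range(C):
                    for r in range(R):
                        for s in range(S):
                            acc += image[c][p + r][q + s] * filters[k][c][r][s]
                out[k][p][q] = acc
    return out


def conv_fprop_flops(batch, in_channels, out_channels, out_h, out_w,
                     kernel_h, kernel_w):
    """Count floating-point operations for one forward pass.

    Each output element needs in_channels * kernel_h * kernel_w
    multiply-adds, and a multiply-add is counted as 2 operations.
    """
    macs = (batch * out_channels * out_h * out_w
            * in_channels * kernel_h * kernel_w)
    return 2 * macs


def computational_efficiency(flops, elapsed_seconds, peak_flops_per_second):
    """Assumed definition: achieved throughput over device peak throughput."""
    achieved = flops / elapsed_seconds
    return achieved / peak_flops_per_second
```

Under this assumed definition, an efficiency of 96.3% would mean the kernel sustains 0.963 of the device's peak arithmetic rate while computing the convolution. The elapsed time and peak rate would come from measurement and the hardware specification respectively; no values are supplied here.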

Code & Models

Repositories

eBay/maxDNN (Official)

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing

Methods: Convolution