Fast Convolutional Nets With fbfft: A GPU Performance Evaluation
Nicolas Vasilache, Jeff Johnson, Michael Mathieu, Soumith Chintala, Serkan Piantino, Yann LeCun

TL;DR
This paper evaluates GPU performance for CNN training and introduces two FFT-based convolution implementations that run faster than NVIDIA's cuDNN on many common convolutional layers. It also analyzes the regimes in which time domain and Fourier domain convolutions each perform best.
Contribution
The paper presents two new open source FFT-based convolution implementations for NVIDIA GPUs, one built on NVIDIA's cuFFT library and one built on fbfft, a Facebook-authored FFT implementation, and provides a detailed performance analysis of both.
Findings
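fbfft is over 1.5x faster than cuFFT for whole CNNs
Both implementations are faster than cuDNN for many common convolutional layers, by up to 23.5x for some synthetic kernel configurations
The paper identifies performance regimes where straightforward time domain convolutions outperform Fourier domain convolutions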
Abstract
We examine the performance profile of Convolutional Neural Network training on the current generation of NVIDIA Graphics Processing Units. We introduce two new Fast Fourier Transform convolution implementations: one based on NVIDIA's cuFFT library, and another based on a Facebook authored FFT implementation, fbfft, that provides significant speedups over cuFFT (over 1.5x) for whole CNNs. Both of these convolution implementations are available in open source, and are faster than NVIDIA's cuDNN implementation for many common convolutional layers (up to 23.5x for some synthetic kernel configurations). We discuss different performance regimes of convolutions, comparing areas where straightforward time domain convolutions outperform Fourier frequency domain convolutions. Details on algorithmic applications of NVIDIA GPU hardware specifics in the implementation of fbfft are also provided.
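To make the time domain versus Fourier domain comparison concrete, here is a minimal pure-Python sketch of the convolution theorem that FFT-based convolution relies on. It is an illustration only, not the fbfft or cuFFT implementation: the function names (`fft`, `direct_conv`, `fft_conv`) and the 1-D toy data are assumptions introduced here, and the paper's implementations operate on batched 2-D feature maps on the GPU.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two.

    The inverse transform is left unnormalized (the caller divides by n).
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def direct_conv(signal, kernel):
    """Time domain linear convolution: O(n * k) multiply-adds."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, w in enumerate(kernel):
            out[i + j] += s * w
    return out

def fft_conv(signal, kernel):
    """Fourier domain linear convolution via zero-padded FFTs."""
    out_len = len(signal) + len(kernel) - 1
    n = 1
    while n < out_len:  # pad to a power of two to avoid circular wrap-around
        n *= 2
    fs = fft(list(signal) + [0.0] * (n - len(signal)))
    fk = fft(list(kernel) + [0.0] * (n - len(kernel)))
    prod = [a * b for a, b in zip(fs, fk)]  # pointwise product in frequency
    inv = fft(prod, inverse=True)
    return [v.real / n for v in inv[:out_len]]

signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 0.0, -1.0]
direct = direct_conv(signal, kernel)
via_fft = fft_conv(signal, kernel)
max_err = max(abs(a - b) for a, b in zip(direct, via_fft))
print(direct)
print(max_err < 1e-9)
```

Both routines compute the same result up to floating-point rounding. The direct loop costs on the order of n * k operations, while the FFT route costs on the order of n log n regardless of kernel size, which is one reason small kernels can favor the time domain and larger kernels the Fourier domain.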
Taxonomy
Topics: Advanced Neural Network Applications · Seismic Imaging and Inversion Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Convolution
