One weird trick for parallelizing convolutional neural networks

Alex Krizhevsky

arXiv:1404.5997 · cs.NE · April 29, 2014 · 984 citations

TL;DR

This paper introduces a method for parallelizing the training of convolutional neural networks across multiple GPUs. The author reports that it scales significantly better than the alternatives when applied to modern convolutional neural networks.

Contribution

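The paper proposes a multi-GPU training scheme that applies data parallelism to the convolutional layers and model parallelism to the fully connected layers, so that each layer type is split across GPUs in the way that suits it best.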

Findings

01 Scales significantly better than the alternative parallelization methods

02 Targets modern convolutional neural network architectures

03 Reduces training time when multiple GPUs are used

Abstract

I present a new way to parallelize the training of convolutional neural networks across multiple GPUs. The method scales significantly better than all alternatives when applied to modern convolutional neural networks.
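The abstract does not describe the mechanism. As background, the two standard ways to split network training across devices are data parallelism (each device holds a full copy of the weights and a slice of the batch) and model parallelism (each device holds a slice of the weights and sees the whole input). The following is a minimal pure-Python sketch of both ideas on a toy linear model, not the paper's implementation: the "workers" are ordinary function calls standing in for GPUs, and every function name here is hypothetical.

```python
# Toy illustration only: "workers" are sequential function calls, not GPUs.
# All names below are hypothetical and chosen for this sketch.

def matvec_rows(weights, x):
    """Return weights @ x for a list-of-rows matrix and a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def grad_linear_mse(w, batch):
    """Gradient of mean squared error for y_hat = w . x over (x, y) pairs."""
    n = len(batch)
    grad = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / n
    return grad


def data_parallel_grad(w, batch, num_workers):
    """Data parallelism: every worker has all of w and an equal batch slice."""
    shard = len(batch) // num_workers
    assert shard * num_workers == len(batch), "batch must divide evenly"
    partials = [
        grad_linear_mse(w, batch[k * shard:(k + 1) * shard])
        for k in range(num_workers)
    ]
    # Averaging equal-sized shard gradients recovers the full-batch gradient.
    return [sum(p[j] for p in partials) / num_workers for j in range(len(w))]


def model_parallel_matvec(weights, x, num_workers):
    """Model parallelism: every worker has a slice of the output rows."""
    rows = len(weights) // num_workers
    assert rows * num_workers == len(weights), "rows must divide evenly"
    out = []
    for k in range(num_workers):
        # Each worker needs the whole input x but only its own rows.
        out.extend(matvec_rows(weights[k * rows:(k + 1) * rows], x))
    return out


# Small made-up inputs to compare the split and unsplit computations.
batch = [([1.0, 2.0], 3.0), ([0.5, -1.0], 0.0),
         ([2.0, 0.0], 1.0), ([-1.0, 1.0], 2.0)]
w = [0.1, -0.2]
full_grad = grad_linear_mse(w, batch)
split_grad = data_parallel_grad(w, batch, num_workers=2)

layer = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0], [0.5, 0.5]]
full_out = matvec_rows(layer, [3.0, 4.0])
split_out = model_parallel_matvec(layer, [3.0, 4.0], num_workers=2)
```

In both cases the split computation matches the single-worker result (up to floating-point rounding for the gradient). What differs is what must be communicated: data parallelism exchanges gradients the size of the weights, while model parallelism exchanges activations.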

Code & Models

Models

Kalray/alexnet (Hugging Face model)

Taxonomy

Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Neural Networks and Applications