ImageNet Training in Minutes

Yang You; Zhao Zhang; Cho-Jui Hsieh; James Demmel; Kurt Keutzer

arXiv:1709.05011 · cs.CV · February 1, 2018 · 45 cites

TL;DR

This paper shows that large-batch synchronous SGD with LARS can train ResNet-50 on ImageNet in 14 minutes, reaching 74.9% accuracy without losing accuracy relative to smaller-batch training.

Contribution

The authors apply the LARS algorithm to large-batch synchronous SGD, scaling ImageNet training across many processors so that it finishes in minutes without sacrificing accuracy.
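
For readers unfamiliar with LARS, the following is a minimal sketch of one layer-wise update, following the rule described in the cited LARS paper (arXiv:1708.03888): each layer's step is scaled by the ratio of its weight norm to its gradient norm. The function name `lars_step`, the flat-list representation of a layer, the default hyperparameters, and the zero-norm fallback are all assumptions made for illustration, not the authors' implementation.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def lars_step(weights, grads, velocity, global_lr=0.1, trust_coef=0.001,
              weight_decay=0.0005, momentum=0.9):
    """One LARS-with-momentum update for a single layer (hypothetical helper).

    All hyperparameter defaults are illustrative placeholders.
    Returns (new_weights, new_velocity, local_lr).
    """
    w_norm = l2_norm(weights)
    g_norm = l2_norm(grads)
    denom = g_norm + weight_decay * w_norm

    # Layer-wise rate: trust_coef * ||w|| / (||g|| + weight_decay * ||w||).
    # Assumption: fall back to 1.0 when a norm is zero, so the global rate applies.
    if w_norm > 0.0 and denom > 0.0:
        local_lr = trust_coef * w_norm / denom
    else:
        local_lr = 1.0

    # Momentum update on the weight-decayed gradient, scaled by both rates.
    new_velocity = [
        momentum * v + global_lr * local_lr * (g + weight_decay * w)
        for v, g, w in zip(velocity, grads, weights)
    ]
    new_weights = [w - v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity, local_lr
```

With weight decay set to zero, the size of the step depends on the weight norm rather than the gradient norm: multiplying a layer's gradient by 10 divides its local rate by 10, so the resulting update is unchanged. This normalisation is what lets each layer keep a stable step size as the batch size, and with it the global learning rate, grows.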

Findings

01

ResNet-50 training completed in 14 minutes with 74.9% accuracy.

02

AlexNet trained in 11 minutes on 1024 CPUs.

03

At batch sizes above 16K, accuracy is higher than that of previous methods at comparable batch sizes.

Abstract

Finishing 90-epoch ImageNet-1k training with ResNet-50 on an NVIDIA M40 GPU takes 14 days. This training requires 10^18 single-precision operations in total. On the other hand, the world's current fastest supercomputer can finish 2 * 10^17 single-precision operations per second (Dongarra et al. 2017, https://www.top500.org/lists/2017/06/). If we can make full use of the supercomputer for DNN training, we should be able to finish the 90-epoch ResNet-50 training in one minute. However, the current bottleneck for fast DNN training is at the algorithm level. Specifically, the current batch size (e.g. 512) is too small to make efficient use of many processors. For large-scale DNN training, we focus on using large-batch data-parallel synchronous SGD without losing accuracy in the fixed epochs. The LARS algorithm (You, Gitman, Ginsburg, 2017, arXiv:1708.03888) enables us to scale the batch…
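
The back-of-the-envelope estimate in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the three figures quoted above (10^18 operations, 14 days on one M40, 2 * 10^17 operations per second); the variable names are illustrative, and the result assumes perfect utilisation of the supercomputer, which real training does not achieve.

```python
# Figures quoted in the abstract.
TOTAL_OPS = 1e18                    # single-precision ops for 90-epoch ResNet-50
M40_DAYS = 14                       # wall-clock time on one NVIDIA M40
SUPERCOMPUTER_OPS_PER_SEC = 2e17    # peak rate of the fastest machine cited

# Wall-clock seconds on the single GPU.
m40_seconds = M40_DAYS * 24 * 60 * 60

# Sustained throughput implied by the single-GPU run (ops per second).
implied_m40_rate = TOTAL_OPS / m40_seconds

# Ideal time at full utilisation of the supercomputer.
ideal_seconds = TOTAL_OPS / SUPERCOMPUTER_OPS_PER_SEC

# Ideal speedup over the single-GPU run.
ideal_speedup = m40_seconds / ideal_seconds

print(f"M40 run: {m40_seconds} s, about {implied_m40_rate:.2e} ops/s sustained")
print(f"Ideal supercomputer time: {ideal_seconds} s")
print(f"Ideal speedup: {ideal_speedup:.0f}x")
```

At perfect utilisation the ideal time works out to 5 seconds, comfortably inside the one-minute target stated in the abstract. The gap between that bound and the minutes actually reported reflects the limit the abstract names: the batch size must be large enough to keep many processors busy.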

Code & Models

Repositories

fuentesdt/livermask
tf

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Medical Image Segmentation Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · LARS