Yet Another Accelerated SGD: ResNet-50 Training on ImageNet in 74.7 seconds

Masafumi Yamazaki; Akihiko Kasagi; Akihiro Tabuchi; Takumi Honda; Masahiro Miwa; Naoto Fukumoto; Tsuguchika Tabaru; Atsushi Ike; Kohta Nakashima

arXiv:1903.12650·cs.LG·April 1, 2019·70 cites

Open Access

TL;DR

This paper presents optimization methods for large mini-batch distributed training that allow ResNet-50 to be trained on ImageNet in 74.7 seconds using 2,048 GPUs on the ABCI cluster, reaching 75.08% top-1 validation accuracy.

Contribution

It introduces new optimization methods that enable ultra-fast distributed training of deep neural networks on large GPU clusters.

Findings

01

Training time of 74.7 seconds for ResNet-50 on ImageNet

02

Achieved training throughput of over 1.73 million images/sec

03

Top-1 validation accuracy of 75.08%

Abstract

There has been strong demand for algorithms that can execute machine learning as fast as possible, and the speed of deep learning has increased 30-fold in just the past two years. Distributed deep learning with large mini-batches is a key technology for meeting this demand, and it is a great challenge because it is difficult to achieve high scalability on large clusters without compromising accuracy. In this paper, we introduce the optimization methods that we applied to this challenge. By applying these methods, we achieved a training time of 74.7 seconds using 2,048 GPUs on the ABCI cluster. The training throughput is over 1.73 million images/sec and the top-1 validation accuracy is 75.08%.
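As a rough reading aid, the aggregate throughput quoted in the abstract can be divided by the GPU count to get an approximate per-GPU rate. This is only a back-of-the-envelope sketch: the function name `per_gpu_throughput` is hypothetical, the inputs are the figures quoted above, and because the abstract says "over 1.73 million images/sec" the result is a lower bound that assumes the load is spread evenly across GPUs.

```python
# Figures quoted in the abstract.
NUM_GPUS = 2048
# "over 1.73 million images/sec", so treat this as a lower bound.
THROUGHPUT_IMAGES_PER_SEC = 1.73e6


def per_gpu_throughput(total_images_per_sec: float, num_gpus: int) -> float:
    """Average images/sec handled by one GPU, assuming an even split."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return total_images_per_sec / num_gpus


if __name__ == "__main__":
    rate = per_gpu_throughput(THROUGHPUT_IMAGES_PER_SEC, NUM_GPUS)
    # Roughly 845 images/sec per GPU at the quoted aggregate rate.
    print(f"at least {rate:.1f} images/sec per GPU")
```

The per-GPU figure is an average derived from the abstract's numbers only; it says nothing about how the paper measured throughput or how evenly work was actually distributed.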

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Medical Image Segmentation Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings