Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data
Thorsten Kurth, Jian Zhang, Nadathur Satish, Ioannis Mitliagkas, Evan Racah, Mostofa Ali Patwary, Tareq Malas, Narayanan Sundaram, Wahid Bhimji, Mikhail Smorkalov, Jack Deslippe, Mikhail Shiryaev, Srinivas Sridharan, Prabhat, Pradeep Dubey

TL;DR
This paper introduces a 15-PetaFLOP deep learning system capable of high-accuracy scientific data classification, demonstrating scalable training on HPC architectures for high-energy physics and climate data analysis.
Contribution
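It presents the first deep learning system to reach 15 PFLOP/s, with supervised convolutional architectures for discriminating signals in high-energy physics data, semi-supervised architectures for localizing and classifying extreme weather in climate data, and a hybrid synchronous/asynchronous training strategy that scales a single model to 9600 Xeon Phi nodes.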
Findings
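Achieved peak performance of 11.73-15.07 PFLOP/s and sustained performance of 11.41-13.27 PFLOP/s when training a single model on 9600 Xeon Phi nodes.
Produced state-of-the-art classification accuracy on a high-energy physics dataset of 10M images.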
Successfully extracted weather patterns from large climate datasets.
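As a rough, back-of-the-envelope reading of the figures quoted in the abstract (2 TFLOP/s on one node, 9600 nodes, 11.41-13.27 PFLOP/s sustained), the sketch below compares the aggregate rate with an idealized linear scale-up of the single-node rate. The function name `scaling_efficiency` is hypothetical, and the comparison assumes the single-node figure is an appropriate baseline for every run, which the summary does not confirm.

```python
def scaling_efficiency(aggregate_pflops, nodes=9600, per_node_tflops=2.0):
    """Ratio of the measured aggregate rate to a perfectly linear scale-up.

    Defaults come from the abstract; treating 2 TFLOP/s as the baseline
    for all runs is an assumption made for illustration only.
    """
    ideal_pflops = nodes * per_node_tflops / 1000.0  # TFLOP/s -> PFLOP/s
    return aggregate_pflops / ideal_pflops


if __name__ == "__main__":
    for label, value in [("sustained, low", 11.41),
                         ("sustained, high", 13.27),
                         ("peak, high", 15.07)]:
        print(f"{label}: {scaling_efficiency(value):.1%} of ideal")
```

Under these assumptions the sustained figures correspond to roughly 59% to 69% of ideal linear scaling.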
Abstract
This paper presents the first, 15-PetaFLOP Deep Learning system for solving scientific pattern classification problems on contemporary HPC architectures. We develop supervised convolutional architectures for discriminating signals in high-energy physics data as well as semi-supervised architectures for localizing and classifying extreme weather in climate data. Our Intelcaffe-based implementation obtains 2TFLOP/s on a single Cori Phase-II Xeon-Phi node. We use a hybrid strategy employing synchronous node-groups, while using asynchronous communication across groups. We use this strategy to scale training of a single model to 9600 Xeon-Phi nodes; obtaining peak performance of 11.73-15.07 PFLOP/s and sustained performance of 11.41-13.27 PFLOP/s. At scale, our HEP architecture produces state-of-the-art classification accuracy on a dataset with 10M images, exceeding that achieved…
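The hybrid strategy described in the abstract (synchronous within a node-group, asynchronous across groups) can be illustrated with a toy simulation. This is a minimal sketch, not the authors' Intelcaffe implementation: the scalar quadratic objective, the single shared model standing in for a parameter server, and all names (`noisy_grad`, `hybrid_sgd`) are assumptions made for illustration.

```python
import random


def noisy_grad(w, target, rng):
    # Gradient of 0.5 * (w - target)**2 plus noise standing in for minibatch variance.
    return (w - target) + rng.gauss(0.0, 0.1)


def hybrid_sgd(num_groups=4, workers_per_group=8, steps=400,
               lr=0.05, target=3.0, seed=0):
    rng = random.Random(seed)
    shared_w = 0.0                       # shared model (hypothetical parameter server)
    local_w = [shared_w] * num_groups    # each group's possibly stale copy
    for _ in range(steps):
        # Asynchrony across groups: groups finish in arbitrary order.
        g = rng.randrange(num_groups)
        # Synchrony within a group: every worker uses the same weights,
        # and the group averages its gradients before updating.
        grads = [noisy_grad(local_w[g], target, rng)
                 for _ in range(workers_per_group)]
        avg_grad = sum(grads) / workers_per_group
        # The group pushes its update without waiting for other groups,
        # so the gradient may have been computed on stale weights.
        shared_w -= lr * avg_grad
        local_w[g] = shared_w            # group pulls the fresh model
    return shared_w


if __name__ == "__main__":
    print(hybrid_sgd())
```

In this sketch, setting `num_groups=1` removes staleness and reduces the loop to fully synchronous SGD, while a larger `num_groups` means fewer workers wait on each other but updates are computed on staler weights.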
Taxonomy
Topics: Algorithms and Data Compression · Computational Physics and Python Applications · Generative Adversarial Networks and Image Synthesis
