TBD: Benchmarking and Analyzing Deep Neural Network Training

Hongyu Zhu, Mohamed Akrout, Bojian Zheng, Andrew Pelegris, Amar Phanishayee, Bianca Schroeder, and Gennady Pekhimenko

arXiv:1803.06905 · cs.LG · April 17, 2018 · 55 cites

Open Access

TL;DR

This paper introduces a comprehensive benchmark called TBD for evaluating DNN training across various models and applications, along with tools for performance and memory analysis on multiple frameworks and hardware setups.

Contribution

It proposes a new broad benchmark for DNN training covering diverse applications and provides a detailed performance and memory analysis toolkit for major frameworks and hardware configurations.

Findings

01. Identified key bottlenecks in DNN training performance.

02. Provided insights into memory consumption patterns during training.

03. Recommended optimization directions for future DNN training research.

Abstract

The recent popularity of deep neural networks (DNNs) has generated a lot of research interest in performing DNN-related computation efficiently. However, the primary focus is usually very narrow and limited to (i) inference -- i.e. how to efficiently execute already trained models and (ii) image classification networks as the primary benchmark for evaluation. Our primary goal in this work is to break this myopic view by (i) proposing a new benchmark for DNN training, called TBD (TBD is short for Training Benchmark for DNNs), that uses a representative set of DNN models that cover a wide range of machine learning applications: image classification, machine translation, speech recognition, object detection, adversarial networks, reinforcement learning, and (ii) by performing an extensive performance analysis of training these different applications on three major deep learning…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning