Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark

Cody Coleman; Daniel Kang; Deepak Narayanan; Luigi Nardi; Tian Zhao; Jian Zhang; Peter Bailis; Kunle Olukotun; Chris Re; Matei Zaharia

arXiv:1806.01427·cs.LG·December 3, 2019

TL;DR

This paper analyzes DAWNBench, a benchmark that measures end-to-end training time to reach near-state-of-the-art accuracy. It examines how stable the metric is, how well the optimized models generalize, how fully entries use the hardware, and where communication becomes a bottleneck.

Contribution

The study analyzes DAWNBench entries in depth, evaluating the stability of the time-to-accuracy (TTA) metric, how well TTA-optimized models generalize, and hardware utilization trends in high-performance deep learning training.

Findings

01

TTA has a low coefficient of variation, indicating consistent measurement.

02

Models optimized for TTA generalize nearly as well as standard models.

03

DAWNBench entries underutilize hardware capabilities like Tensor Cores.
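
The first finding rests on the coefficient of variation, which is the standard deviation of a set of measurements divided by their mean. The sketch below shows how that ratio could be computed for repeated TTA measurements of one training configuration. The function name `coefficient_of_variation` and the run times in `tta_hours` are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import statistics


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean


# Hypothetical TTA measurements (in hours) from repeated runs of the
# same configuration. These numbers are illustrative, not from the paper.
tta_hours = [2.9, 3.0, 3.1, 3.0, 3.0]

cv = coefficient_of_variation(tta_hours)
print(f"coefficient of variation: {cv:.2%}")
```

A small ratio means that run-to-run variation is small relative to the typical training time, which is the sense in which the measurement is described as consistent.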

Abstract

Researchers have proposed hardware, software, and algorithmic optimizations to improve the computational performance of deep learning. While some of these optimizations perform the same operations faster (e.g., increasing GPU clock speed), many others modify the semantics of the training procedure (e.g., reduced precision), and can impact the final model's accuracy on unseen data. Due to a lack of standard evaluation criteria that consider these trade-offs, it is difficult to directly compare these optimizations. To address this problem, we recently introduced DAWNBench, a benchmark competition focused on end-to-end training time to achieve near-state-of-the-art accuracy on an unseen dataset, a combined metric called time-to-accuracy (TTA). In this work, we analyze the entries from DAWNBench, which received optimized submissions from multiple industrial groups, to investigate the…
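
As a rough illustration of the TTA metric described in the abstract, the sketch below scans a training log and returns the elapsed time at which held-out accuracy first reaches a target. The log format, the function name `time_to_accuracy`, the threshold of 0.93, and the logged values are all assumptions made for this example; the abstract does not specify them.

```python
from typing import Iterable, Optional, Tuple


def time_to_accuracy(
    log: Iterable[Tuple[float, float]], threshold: float
) -> Optional[float]:
    """Return the elapsed time at which accuracy first reaches the threshold.

    `log` holds (elapsed_seconds, accuracy) pairs in chronological order.
    Returns None if the threshold is never reached.
    """
    for elapsed_seconds, accuracy in log:
        if accuracy >= threshold:
            return elapsed_seconds
    return None


# Hypothetical per-epoch log: (seconds since training started, held-out accuracy).
training_log = [
    (1800.0, 0.71),
    (3600.0, 0.88),
    (5400.0, 0.931),
    (7200.0, 0.936),
]

tta_seconds = time_to_accuracy(training_log, threshold=0.93)
print(tta_seconds)
```

Because the clock stops only once the accuracy target is met, an optimization that speeds up each step but prevents the model from reaching the target yields no TTA at all, which is how the metric accounts for the trade-off between speed and accuracy.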
