Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark
Cody Coleman, Daniel Kang, Deepak Narayanan, Luigi Nardi, Tian Zhao, Jian Zhang, Peter Bailis, Kunle Olukotun, Chris Ré, Matei Zaharia

TL;DR
This paper analyzes DAWNBench, a benchmark for end-to-end training time to reach near-state-of-the-art accuracy, revealing insights into its effectiveness, model generalization, hardware utilization, and communication bottlenecks.
Contribution
The study provides an in-depth analysis of DAWNBench entries, evaluating the TTA metric's stability, model generalization, and hardware utilization trends in high-performance deep learning training.
Findings
TTA has a low coefficient of variation, indicating consistent measurement.
Models optimized for TTA generalize nearly as well as standard models.
DAWNBench entries underutilize hardware capabilities like Tensor Cores.
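The first finding rests on two simple quantities: the time-to-accuracy of a single run and the coefficient of variation (standard deviation divided by mean) across repeated runs. The following is a minimal sketch of how they could be computed, not the authors' evaluation code. The log format, the function names, the 0.93 threshold, and all numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def time_to_accuracy(log, threshold):
    """Return the elapsed time of the first checkpoint whose validation
    accuracy meets the threshold, or None if the run never reaches it.

    `log` is an assumed format: a list of (elapsed_seconds, val_accuracy)
    pairs in chronological order.
    """
    for elapsed_seconds, val_accuracy in log:
        if val_accuracy >= threshold:
            return elapsed_seconds
    return None


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


# Hypothetical validation logs from three runs of the same training recipe.
runs = [
    [(600, 0.81), (1200, 0.90), (1800, 0.935), (2400, 0.94)],
    [(600, 0.80), (1200, 0.91), (1800, 0.929), (2400, 0.936)],
    [(600, 0.82), (1200, 0.92), (1800, 0.931), (2400, 0.94)],
]
THRESHOLD = 0.93  # illustrative accuracy target, not taken from the paper

# Keep only runs that actually reached the target accuracy.
ttas = [t for t in (time_to_accuracy(r, THRESHOLD) for r in runs) if t is not None]
cv = coefficient_of_variation(ttas)
print(f"TTA per run (s): {ttas}, coefficient of variation: {cv:.3f}")
```

A small coefficient of variation means the same recipe reaches the target accuracy in roughly the same time on every run, which is what makes TTA usable for comparing entries.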
Abstract
Researchers have proposed hardware, software, and algorithmic optimizations to improve the computational performance of deep learning. While some of these optimizations perform the same operations faster (e.g., increasing GPU clock speed), many others modify the semantics of the training procedure (e.g., reduced precision), and can impact the final model's accuracy on unseen data. Due to a lack of standard evaluation criteria that consider these trade-offs, it is difficult to directly compare these optimizations. To address this problem, we recently introduced DAWNBench, a benchmark competition focused on end-to-end training time to achieve near-state-of-the-art accuracy on an unseen dataset, a combined metric called time-to-accuracy (TTA). In this work, we analyze the entries from DAWNBench, which received optimized submissions from multiple industrial groups, to investigate the…
