Benchmarking State-of-the-Art Deep Learning Software Tools
Shaohuai Shi, Qiang Wang, Pengfei Xu, Xiaowen Chu

TL;DR
This paper provides a comprehensive benchmarking study of leading GPU-accelerated deep learning software tools across various hardware platforms, offering insights for users and developers to optimize performance.
Contribution
It offers the first detailed comparison of popular deep learning tools on multiple hardware setups, guiding users and informing future software optimizations.
Findings
Performance varies significantly across tools and hardware.
GPU acceleration substantially reduces training time.
Benchmark results inform optimal hardware and software choices.
Abstract
Deep learning has been shown to be a successful machine learning method for a variety of tasks, and its popularity has resulted in numerous open-source deep learning software tools. Training a deep network is usually a very time-consuming process. To address the computational challenge in deep learning, many tools exploit hardware features such as multi-core CPUs and many-core GPUs to shorten the training time. However, different tools exhibit different features and running performance when training different types of deep networks on different hardware platforms, which makes it difficult for end users to select an appropriate pair of software and hardware. In this paper, we aim to make a comparative study of the state-of-the-art GPU-accelerated deep learning software tools, including Caffe, CNTK, MXNet, TensorFlow, and Torch. We first benchmark the running performance of these tools with three…
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Ferroelectric and Negative Capacitance Devices
