Omnivore: An Optimizer for Multi-device Deep Learning on CPUs and GPUs
Stefan Hadjis, Ce Zhang, Ioannis Mitliagkas, Dan Iter, Christopher Ré

TL;DR
This paper introduces Omnivore, a system that optimizes multi-device deep learning training on CPUs and GPUs by improving throughput, tuning asynchronous parallelization, and efficiently allocating resources, resulting in significantly faster training times.
Contribution
Omnivore provides a novel understanding of system and optimization interactions, and offers an efficient hyperparameter optimizer that enhances training speed over existing systems.
Findings
Improves single-node throughput on CPUs by at least 5.5x over state-of-the-art systems, using standard batching and data-parallel techniques.
Provides a hyperparameter optimizer that reduces training time by up to 12x.
Identifies the degree of asynchronous parallelization as a key factor affecting both hardware efficiency and statistical efficiency, so it must be tuned for best performance.
Abstract
We study the factors affecting training time in multi-device deep learning systems. Given a specification of a convolutional neural network, our goal is to minimize the time to train this model on a cluster of commodity CPUs and GPUs. We first focus on the single-node setting and show that by using standard batching and data-parallel techniques, throughput can be improved by at least 5.5x over state-of-the-art systems on CPUs. This ensures an end-to-end training speed directly proportional to the throughput of a device regardless of its underlying hardware, allowing each node in the cluster to be treated as a black box. Our second contribution is a theoretical and empirical study of the tradeoffs affecting end-to-end training time in a multiple-device setting. We identify the degree of asynchronous parallelization as a key factor affecting both hardware and statistical efficiency. We…
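The tradeoff described in the abstract can be illustrated with a minimal sketch. It assumes that end-to-end training time is the product of seconds per iteration (hardware efficiency) and iterations needed to converge (statistical efficiency), and that more asynchrony improves the first while hurting the second. The names `Config`, `groups`, `total_time`, and `best_config`, and all of the numbers, are hypothetical and chosen only for illustration; they are not taken from the paper and do not represent Omnivore's actual optimizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One candidate degree of asynchrony (all values are made up)."""
    groups: int             # hypothetical knob: number of asynchronous worker groups
    sec_per_iter: float     # hardware efficiency: wall-clock seconds per iteration
    iters_to_converge: int  # statistical efficiency: iterations to reach a target loss


def total_time(cfg: Config) -> float:
    """End-to-end training time under the simple product model assumed here."""
    return cfg.sec_per_iter * cfg.iters_to_converge


def best_config(configs: list[Config]) -> Config:
    """Pick the candidate with the smallest modeled training time."""
    return min(configs, key=total_time)


# Illustrative numbers only: more groups make each iteration cheaper,
# but stale gradients raise the number of iterations required.
candidates = [
    Config(groups=1, sec_per_iter=0.40, iters_to_converge=10_000),
    Config(groups=4, sec_per_iter=0.12, iters_to_converge=14_000),
    Config(groups=16, sec_per_iter=0.05, iters_to_converge=40_000),
]

best = best_config(candidates)
print(best.groups, round(total_time(best)))
```

With these invented numbers, neither the fully synchronous setting nor the most asynchronous one wins; an intermediate setting gives the lowest modeled time, which is the kind of interior optimum that makes tuning the degree of asynchrony worthwhile.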
Taxonomy
Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Advanced Memory and Neural Computing
