Violet: Architecturally Exposed Orchestration, Movement, and Placement for Generalized Deep Learning
Michael Davies, Karthikeyan Sankaralingam

TL;DR
Violet is a novel architecture inspired by multicore SIMD that balances data orchestration, movement, and work placement to improve deep learning performance and efficiency across diverse applications, outperforming NVIDIA's TensorCore-based systems.
Contribution
The paper introduces Violet, a new architecture that effectively balances key deep learning challenges, demonstrating significant performance and efficiency improvements over existing GPU systems.
Findings
Compared to the NVIDIA A100 GPU, Violet achieves geo-mean 2.4X performance and 10.6X efficiency for inference.
Compared to the NVIDIA A100 GPU, Violet achieves geo-mean 2.1X performance and 9.5X efficiency for training.
Operator-level analysis reveals key behaviors influencing speedup.
Abstract
Deep learning and hardware for it have garnered immense academic and industry interest in the past 5 years, with many novel proposals. However, the state of the art remains NVIDIA's TensorCore-based systems, which provide top-of-the-line performance and coverage across a wide spectrum of deep learning applications. In this paper, we first identify four key problems any new DL solution must solve: 1) data orchestration, 2) data movement, 3) work placement, and blending these to achieve 4) coverage across different types of DL applications. With this as a guide, we propose Violet, a novel architecture with roots in multicore SIMD, which balances the responsibilities for these four problems between the architecture, microarchitecture, and software stack. Compared to the NVIDIA A100 GPU, we find Violet achieves geo-mean 2.4X/10.6X and 2.1X/9.5X performance/efficiency for inference and training across the…
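The headline numbers in the abstract are geometric means of per-workload ratios. The sketch below shows how such a summary figure is typically computed. The function name `geo_mean` and the sample speedups are hypothetical, chosen only for illustration; they are not data from the paper.

```python
import math


def geo_mean(ratios):
    """Geometric mean of positive ratios, computed in log space for stability."""
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical per-workload speedups over a baseline (NOT results from the paper).
speedups = [1.0, 2.0, 4.0]

geo = geo_mean(speedups)                 # cube root of 1 * 2 * 4
arith = sum(speedups) / len(speedups)    # arithmetic mean, for comparison

print(f"geometric mean:  {geo:.3f}")
print(f"arithmetic mean: {arith:.3f}")
```

The geometric mean is the usual choice for averaging speedup ratios because a single outlier workload pulls it up less than it would an arithmetic mean, and because the result does not depend on which system is treated as the baseline.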
Taxonomy
Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
