DeepFlow: A Cross-Stack Pathfinding Framework for Distributed AI Systems
Newsha Ardalani, Saptadeep Pal, Puneet Gupta

TL;DR
DeepFlow is a framework that automates cross-layer analysis and optimization in distributed AI systems, significantly improving hardware utilization by addressing inefficiencies across technology, hardware, and software layers.
Contribution
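The paper introduces CrossFlow, a framework for cross-layer analysis from the technology layer to the algorithmic layer, and DeepFlow, which builds on CrossFlow and uses machine learning techniques to automate design space exploration and co-optimization across the stack.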
Findings
Validated CrossFlow's accuracy against distributed training on real commercial hardware
Demonstrated the pitfalls of leaving individual stack layers unoptimized
Showed improved hardware utilization through case studies
Abstract
Over the past decade, machine learning model complexity has grown at an extraordinary rate, as has the scale of the systems training such large models. However, there is an alarmingly low hardware utilization (5-20%) in large-scale AI systems. The low system utilization is a cumulative effect of minor losses across different layers of the stack, exacerbated by the disconnect between engineers designing different layers across different industries. We propose CrossFlow, a novel framework that enables cross-layer analysis all the way from the technology layer to the algorithmic layer. We also propose DeepFlow (built on top of CrossFlow using machine learning techniques) to automate the design space exploration and co-optimization across different layers of the stack. We have validated CrossFlow accuracy with distributed training on real commercial hardware and showcase several…
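To make the "cumulative effect of minor losses" concrete, the sketch below multiplies per-layer efficiencies into a single end-to-end utilization figure. This is an illustration only, not CrossFlow's model: the multiplicative assumption, the layer names, the efficiency values, and the function `compound_utilization` are all hypothetical and do not come from the paper.

```python
from math import prod


def compound_utilization(layer_efficiencies):
    """Multiply per-layer efficiencies (each in (0, 1]) into one end-to-end figure.

    Assumes losses at each layer are independent and compound multiplicatively,
    which is a simplification made for illustration.
    """
    for name, eff in layer_efficiencies.items():
        if not 0.0 < eff <= 1.0:
            raise ValueError(f"efficiency for {name!r} must be in (0, 1], got {eff}")
    return prod(layer_efficiencies.values())


# Hypothetical per-layer efficiencies, chosen only to illustrate compounding.
hypothetical_layers = {
    "technology": 0.80,
    "microarchitecture": 0.70,
    "memory_hierarchy": 0.60,
    "network": 0.70,
    "parallelization_strategy": 0.70,
    "software_kernels": 0.75,
}

utilization = compound_utilization(hypothetical_layers)
print(f"end-to-end utilization: {utilization:.1%}")
```

Under these made-up numbers no single layer loses more than 40%, yet the product falls inside the 5-20% range the abstract reports. That is the kind of interaction a cross-layer analysis is meant to expose.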
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Advanced Memory and Neural Computing · IoT and Edge/Fog Computing
