Flint: Compiler Enabled Cluster-Free Design Space Exploration for Distributed ML
Jinsun Yoo, Meghan Cowan, Zheng Du, Changhai Man, Srinivas Sridharan, Tushar Krishna

TL;DR
Flint is a compiler-based framework that enables flexible design space exploration for distributed machine learning systems by leveraging compiler intermediate representations, without requiring actual hardware execution.
Contribution
It introduces a novel approach that uses compiler IRs to facilitate cluster-free exploration of distributed ML workloads across different system configurations.
Findings
Workload representations are validated against execution traces.
Flint enables exploration across arbitrary cluster sizes.
A design space exploration case study demonstrates the framework's flexibility.
Abstract
Design space exploration for future distributed Machine Learning systems suffers from the lack of a readily available workload representation that enables flexible exploration across the stack. We present Flint, a framework that bridges this gap by leveraging the Intermediate Representation of Machine Learning framework compilers. The compiler does the heavy lifting of understanding and preserving the behavior of the original model code. Flint can collect workload representations for arbitrary cluster sizes because it interfaces with the compiler before hardware execution. We validate the workload graph against post-execution traces and show the flexibility of Flint through a design space exploration case study.
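To make the idea of cluster-free exploration concrete, here is a minimal Python sketch of sweeping cluster size over a workload graph without running on any hardware. It is not Flint's implementation or API: the `Node` type, the toy three-node graph, the device throughput, and the link bandwidth are all hypothetical values chosen for illustration. The all-reduce cost uses the standard bandwidth term of a ring all-reduce, 2(N-1)/N * S / B, and the makespan is a simple critical-path estimate that ignores latency and resource contention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One operation in a hypothetical workload graph (not Flint's format)."""
    name: str
    kind: str            # "compute" or "allreduce"
    flops: float = 0.0   # work per device, used by compute nodes
    size_bytes: float = 0.0  # payload, used by allreduce nodes
    deps: tuple = ()     # names of nodes that must finish first


def ring_allreduce_seconds(size_bytes, num_devices, link_bandwidth):
    """Bandwidth term of a ring all-reduce: 2(N-1)/N * S / B."""
    if num_devices <= 1:
        return 0.0  # nothing to communicate on a single device
    return 2 * (num_devices - 1) / num_devices * size_bytes / link_bandwidth


def estimate_makespan(nodes, num_devices, flops_per_sec, link_bandwidth):
    """Critical-path estimate; `nodes` must be in topological order.

    Assumes unlimited overlap between independent nodes and no latency term.
    """
    finish = {}
    for node in nodes:
        start = max((finish[d] for d in node.deps), default=0.0)
        if node.kind == "compute":
            duration = node.flops / flops_per_sec
        else:
            duration = ring_allreduce_seconds(
                node.size_bytes, num_devices, link_bandwidth
            )
        finish[node.name] = start + duration
    return max(finish.values())


# Hypothetical data-parallel training step: forward, backward, gradient sync.
TOY_GRAPH = [
    Node("forward", "compute", flops=2e12),
    Node("backward", "compute", flops=4e12, deps=("forward",)),
    Node("grad_sync", "allreduce", size_bytes=1e9, deps=("backward",)),
]

# Sweep cluster sizes analytically; no cluster is needed to evaluate any of them.
for n in (1, 8, 64, 1024):
    t = estimate_makespan(TOY_GRAPH, n, flops_per_sec=1e12, link_bandwidth=1e9)
    print(f"{n:5d} devices: {t:.3f} s per step")
```

Because the graph is fixed before execution, changing `num_devices`, `flops_per_sec`, or `link_bandwidth` only changes the cost model inputs, which is the kind of exploration the abstract describes. A real study would replace the toy graph with a compiler-derived one and the cost functions with a simulator.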
