Evaluating Cross-Architecture Performance Modeling of Distributed ML Workloads Using StableHLO
Jonas Svedas, Nathan Laubeuf, Ryan Harvey, Arjun Singh, Changhai Man, Abubakr Nada, Tushar Krishna, James Myers, Debjyoti Bhattacharjee

TL;DR
This paper explores using MLIR's StableHLO dialect as a unified representation for cross-architecture performance modeling of distributed ML workloads, enabling portable and comparative analysis across GPUs and TPUs.
Contribution
The paper introduces a StableHLO-based simulation methodology that maps a single workload representation onto multiple performance models, enabling cross-platform and cross-fidelity comparisons without physical hardware.
Findings
StableHLO preserves relative performance trends across architectures.
Prediction errors are within practical bounds for early-stage design.
Fidelity-dependent limitations are exposed in existing GPU simulators.
Abstract
Predicting the performance of large-scale distributed machine learning (ML) workloads across multiple accelerator architectures remains a central challenge in ML system design. Existing GPU- and TPU-focused simulators are typically architecture-specific, while distributed training simulators rely on workload-specific analytical models or costly post-execution traces, limiting portability and cross-platform comparison. This work evaluates whether MLIR's StableHLO dialect can serve as a unified workload representation for cross-architecture and cross-fidelity performance modeling of distributed ML workloads. The study establishes a StableHLO-based simulation methodology that maps a single workload representation onto multiple performance models, spanning analytical, profiling-based, and simulator-driven predictors. Using this methodology, workloads are evaluated across GPUs and TPUs…
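To illustrate the idea of mapping one workload representation onto several performance models, here is a minimal sketch of the simplest kind of predictor the abstract mentions: an analytical one. It is not the paper's implementation. It assumes a roofline-style cost model, and the `Op` and `Device` records, the helper functions, and the two device specifications (`device_a`, `device_b`) are all hypothetical. The op names follow StableHLO naming (`stablehlo.dot_general`, `stablehlo.add`) but are used only as labels; no StableHLO parsing is performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One operation in a workload (hypothetical record, not a StableHLO parser)."""
    name: str           # StableHLO-style op name, used only as a label
    flops: float        # floating-point operations performed
    bytes_moved: float  # bytes read from and written to device memory


@dataclass(frozen=True)
class Device:
    """Hypothetical accelerator described by two roofline parameters."""
    name: str
    peak_flops: float   # peak compute throughput, FLOP/s
    mem_bw: float       # peak memory bandwidth, bytes/s


def roofline_time(op: Op, dev: Device) -> float:
    """Roofline estimate: an op is bound by compute or by memory, whichever is slower."""
    return max(op.flops / dev.peak_flops, op.bytes_moved / dev.mem_bw)


def predict(workload: list[Op], dev: Device) -> float:
    """Sum per-op estimates, assuming ops run back to back with no overlap."""
    return sum(roofline_time(op, dev) for op in workload)


def matmul_op(m: int, k: int, n: int, bytes_per_elem: int = 2) -> Op:
    """An (m x k) @ (k x n) matrix multiply: 2*m*k*n FLOPs, each operand touched once."""
    flops = 2.0 * m * k * n
    bytes_moved = float(bytes_per_elem * (m * k + k * n + m * n))
    return Op("stablehlo.dot_general", flops, bytes_moved)


def add_op(n_elems: int, bytes_per_elem: int = 2) -> Op:
    """Elementwise add: one FLOP per element, two inputs read and one output written."""
    return Op("stablehlo.add", float(n_elems), float(3 * bytes_per_elem * n_elems))


# The same workload description is evaluated against every device model.
workload = [matmul_op(4096, 4096, 4096), add_op(4096 * 4096)]

# Illustrative numbers only; they do not describe any real GPU or TPU.
devices = [
    Device("device_a", peak_flops=100e12, mem_bw=1.0e12),
    Device("device_b", peak_flops=200e12, mem_bw=0.5e12),
]

predictions = {dev.name: predict(workload, dev) for dev in devices}

if __name__ == "__main__":
    for name, seconds in predictions.items():
        print(f"{name}: {seconds * 1e3:.3f} ms")
```

The design point the sketch tries to capture is that `workload` is built once and reused unchanged for each `Device`, so differences in the predictions come only from the device model. A profiling-based or simulator-driven predictor could, in principle, replace `roofline_time` behind the same interface.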
