Mystique: Enabling Accurate and Scalable Generation of Production AI Benchmarks
Mingyu Liang, Wenyin Fu, Louis Feng, Zhongyi Lin, Pavani Panakanti, Shengbao Zheng, Srinivas Sridharan, Christina Delimitrou

TL;DR
Mystique is a scalable framework that uses detailed execution traces to generate accurate, portable, and adaptable AI benchmarks for production environments. It targets two challenges: keeping benchmarks representative of fleet workloads and updating them quickly as the fleet changes.
Contribution
It introduces a novel use of PyTorch execution traces for scalable, accurate, and portable AI benchmark generation in production settings.
Findings
Benchmarks closely match original models in execution time and system metrics.
Generated benchmarks are portable across different platforms.
The framework enables flexible and rapid benchmark creation.
Abstract
Building large AI fleets to support the rapidly growing DL workloads is an active research topic for modern cloud providers. Generating accurate benchmarks plays an essential role in designing the fast-paced software and hardware solutions in this space. Two fundamental challenges to make this scalable are (i) workload representativeness and (ii) the ability to quickly incorporate changes to the fleet into the benchmarks. To overcome these issues, we propose Mystique, an accurate and scalable framework for production AI benchmark generation. It leverages the PyTorch execution trace (ET), a new feature that captures the runtime information of AI models at the granularity of operators, in a graph format, together with their metadata. By sourcing fleet ETs, we can build AI benchmarks that are portable and representative. Mystique is scalable, due to its lightweight data collection, in…
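As a rough illustration of the idea in the abstract (an operator-level trace stored as a graph with metadata, then replayed as a benchmark), here is a minimal Python sketch. It is not Mystique's implementation and does not use the real PyTorch ET schema: the field names `id`, `name`, `deps`, and `shape`, the `REGISTRY` of stand-in operators, and the helper functions are all hypothetical, chosen only to show the replay pattern.

```python
import json
from collections import deque

# Hypothetical, simplified trace. This is NOT the real PyTorch ET format;
# it only mimics "operators in a graph, with metadata".
TRACE_JSON = """
{"nodes": [
  {"id": 1, "name": "matmul", "deps": [],  "shape": [2, 2]},
  {"id": 2, "name": "relu",   "deps": [1], "shape": [2, 2]},
  {"id": 3, "name": "sum",    "deps": [2], "shape": [2, 2]}
]}
"""


def _ones(shape):
    """Build synthetic input data from recorded shape metadata."""
    rows, cols = shape
    return [[1.0] * cols for _ in range(rows)]


def op_matmul(shape):
    a = _ones(shape)  # stand-in: multiply a square matrix by itself
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*a)] for row in a]


def op_relu(shape):
    return [[max(0.0, x) for x in row] for row in _ones(shape)]


def op_sum(shape):
    return sum(sum(row) for row in _ones(shape))


# Hypothetical mapping from recorded operator names to stand-in kernels.
REGISTRY = {"matmul": op_matmul, "relu": op_relu, "sum": op_sum}


def topo_order(nodes):
    """Order nodes so every operator runs after its dependencies (Kahn's algorithm)."""
    by_id = {n["id"]: n for n in nodes}
    indegree = {nid: len(n["deps"]) for nid, n in by_id.items()}
    children = {nid: [] for nid in by_id}
    for n in nodes:
        for dep in n["deps"]:
            children[dep].append(n["id"])
    ready = deque(sorted(nid for nid, deg in indegree.items() if deg == 0))
    order = []
    while ready:
        nid = ready.popleft()
        order.append(nid)
        for child in children[nid]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if len(order) != len(nodes):
        raise ValueError("trace graph contains a cycle")
    return [by_id[i] for i in order]


def replay(nodes, registry):
    """Re-execute each recorded operator in dependency order; return the names run."""
    executed = []
    for node in topo_order(nodes):
        registry[node["name"]](node["shape"])  # inputs are regenerated from metadata
        executed.append(node["name"])
    return executed


if __name__ == "__main__":
    print(replay(json.loads(TRACE_JSON)["nodes"], REGISTRY))
```

The design point this sketch tries to capture is that the benchmark needs only the trace, not the original model code: operator names select what to run, and recorded metadata such as shapes is enough to regenerate synthetic inputs.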
Taxonomy
Topics: Software System Performance and Reliability · Cloud Computing and Resource Management · Machine Learning in Materials Science
