Introducing Instruction-Accurate Simulators for Performance Estimation of Autotuning Workloads
Rebecca Pelke, Nils Bosbach, Lennart M. Reimann, Rainer Leupers

TL;DR
This paper introduces a simulation-based autotuning approach for machine learning (ML) workloads that enables scalable performance estimation across diverse hardware architectures, reducing reliance on physical hardware while maintaining high prediction accuracy.
Contribution
It presents an interface for executing autotuning workloads on simulators and demonstrates high prediction accuracy and efficiency in performance estimation.
Findings
The implementation with the best actual run time on the target hardware always ranks within the top 3% of predictions on the tested architectures.
Simulation-based autotuning can outperform native execution on embedded hardware.
High scalability is achieved by running many simulations in parallel.
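The parallel-simulation idea can be illustrated with a minimal Python sketch. It is not the paper's interface: `simulate`, `run_simulations`, the `tile` parameter, and the statistic names are hypothetical, and the "simulator" only computes made-up counts so that the example stays runnable. A thread pool is used on the assumption that a real simulator would run as an external process, so the workers mostly wait.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(candidate):
    """Hypothetical stand-in for one simulator run.

    A real setup would launch an instruction-accurate simulator (for example
    as a subprocess) and parse its statistics. Here the counts are derived
    from a made-up tile size purely for illustration.
    """
    tile = candidate["tile"]
    instructions = 1_000_000 // tile + 50 * tile
    memory_accesses = 200_000 // tile
    return {
        "tile": tile,
        "instructions": instructions,
        "memory_accesses": memory_accesses,
    }


def run_simulations(candidates, max_workers=4):
    # Executor.map() yields results in input order, so the returned
    # statistics line up with the candidate list.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(simulate, candidates))


# Each candidate is one implementation variant proposed by the autotuner.
candidates = [{"tile": t} for t in (1, 2, 4, 8, 16, 32)]
stats = run_simulations(candidates)
```

Because the runs are independent, the number of workers can be raised to match whatever host machines are available, which is the scalability argument made in the findings.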
Abstract
Accelerating Machine Learning (ML) workloads requires efficient methods due to their large optimization space. Autotuning has emerged as an effective approach for systematically evaluating variations of implementations. Traditionally, autotuning requires the workloads to be executed on the target hardware (HW). We present an interface that allows executing autotuning workloads on simulators. This approach offers high scalability when the availability of the target HW is limited, as many simulations can be run in parallel on any accessible HW. Additionally, we evaluate the feasibility of using fast instruction-accurate simulators for autotuning. We train various predictors to forecast the performance of ML workload implementations on the target HW based on simulation statistics. Our results demonstrate that the tuned predictors are highly effective. The best workload implementation in…
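As a rough illustration of the predictor step, the sketch below fits a model on simulation statistics and checks whether the truly fastest candidate lands in the top fraction of the predicted ranking. The summary does not say which predictors or features the authors use, so the single-feature least-squares fit, the function names `fit_line` and `best_within_top`, and all numbers are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    return a, mean_y - a * mean_x


def best_within_top(predicted, measured, fraction):
    """Return True if the candidate with the lowest measured run time is
    among the top `fraction` of candidates ranked by predicted run time."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    best = min(range(len(measured)), key=lambda i: measured[i])
    top_k = max(1, int(len(predicted) * fraction))
    return best in order[:top_k]


# Made-up training samples: simulated instruction counts and run times (ms)
# measured on the target hardware for a few candidates.
train_instr = [100, 200, 300, 400]
train_ms = [1.1, 2.0, 3.1, 3.9]
a, b = fit_line(train_instr, train_ms)

# Made-up unseen candidates: predictions use simulation statistics only.
test_instr = [150, 250, 120, 380, 90]
test_ms = [1.6, 2.4, 1.3, 3.8, 1.0]  # used only to evaluate the ranking
predicted = [a * x + b for x in test_instr]
hit = best_within_top(predicted, test_ms, fraction=0.2)
```

For autotuning, the ranking matters more than the absolute error: the predictor is useful as long as the fastest implementation appears near the top of the predicted order, so only a few candidates need to be confirmed on real hardware.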
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Advanced Neural Network Applications
