MLKAPS: Machine Learning and Adaptive Sampling for HPC Kernel Auto-tuning
Mathys Jam (LI-PaRAD, UVSQ), Eric Petit, Pablo de Oliveira Castro (LI-PaRAD, UVSQ), David Defour (LAMPS, UPVD), Greg Henry, William Jalby (LI-PaRAD, UVSQ)

TL;DR
MLKAPS is an automated tool that uses machine learning and adaptive sampling to optimize HPC kernel configurations, significantly reducing manual effort and outperforming existing auto-tuning methods in speedup and tuning time.
Contribution
MLKAPS introduces a scalable approach combining machine learning and adaptive sampling to automate HPC kernel auto-tuning, outperforming state-of-the-art tools.
Findings
Improves over 85% of inputs on the Intel MKL dgetrf LU kernel, with significant speedups
Outperforms existing auto-tuning tools in tuning time and speedup
Identifies manual tuning blindspots in HPC libraries
Abstract
Many High-Performance Computing (HPC) libraries rely on decision trees to select the best kernel hyperparameters at runtime, depending on the input and environment. However, finding optimized configurations for each input and environment is challenging and requires significant manual effort and computational resources. This paper presents MLKAPS, a tool that automates this task using machine learning and adaptive sampling techniques. MLKAPS generates decision trees that tune HPC kernels' design parameters to achieve efficient performance for any user input. MLKAPS scales to large input and design spaces, outperforming similar state-of-the-art auto-tuning tools in tuning time and mean speedup. We demonstrate the benefits of MLKAPS on the highly optimized Intel MKL dgetrf LU kernel and show that MLKAPS finds blindspots in the manual tuning of HPC experts. It improves over 85% of the inputs with…
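To make the idea of a runtime decision tree concrete, here is a minimal sketch in Python. It is not MLKAPS's implementation: the cost model `toy_cost`, the candidate block sizes, and all function names are hypothetical, and a fixed grid of input sizes stands in for the adaptive sampling the abstract describes. The sketch samples inputs, records the best design parameter for each, and fits a small decision tree that maps an input size to a parameter choice.

```python
from collections import Counter

# Hypothetical design-parameter values (e.g. a blocking factor).
CANDIDATE_BLOCKS = (16, 64, 256)


def toy_cost(n, block):
    """Hypothetical cost model standing in for a real kernel timing."""
    # Small blocks pay more per-block overhead; large blocks pay a fixed penalty.
    return n * n / block + 0.04 * block * block


def best_block(n):
    """Exhaustively pick the cheapest candidate block size for input size n."""
    return min(CANDIDATE_BLOCKS, key=lambda b: toy_cost(n, b))


def majority_errors(part):
    """Number of samples in `part` that disagree with its majority label."""
    counts = Counter(label for _, label in part)
    return len(part) - counts.most_common(1)[0][1]


def fit_tree(samples):
    """Fit a 1-D decision tree on (n, label) pairs.

    Returns either a label (leaf) or a tuple (threshold, left, right).
    """
    labels = [label for _, label in samples]
    if len(set(labels)) == 1:
        return labels[0]
    samples = sorted(samples)
    best = None
    for i in range(1, len(samples)):
        if samples[i - 1][0] == samples[i][0]:
            continue  # cannot split between identical inputs
        left, right = samples[:i], samples[i:]
        errors = majority_errors(left) + majority_errors(right)
        if best is None or errors < best[0]:
            threshold = (samples[i - 1][0] + samples[i][0]) / 2
            best = (errors, threshold, left, right)
    if best is None:
        return Counter(labels).most_common(1)[0][0]
    _, threshold, left, right = best
    return (threshold, fit_tree(left), fit_tree(right))


def predict(tree, n):
    """Walk the tree, as a library would at runtime, to choose a block size."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if n < threshold else right
    return tree


# Fixed grid of input sizes (an assumption; MLKAPS uses adaptive sampling).
sizes = [2 ** k for k in range(3, 13)]
training = [(n, best_block(n)) for n in sizes]
tree = fit_tree(training)
```

At runtime a library would only need the equivalent of `predict(tree, n)`, which costs a handful of comparisons. The split thresholds fall midway between sampled sizes, so the tree can pick a suboptimal parameter for inputs near a crossover; this is the kind of gap that sampling more densely around decision boundaries is meant to reduce.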
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques
