tritonBLAS: Triton-based Analytical Approach for GEMM Kernel Parameter Selection
Ryan Swann, Muhammad Osama, Xiaohu Guo, Bryant Nelson, Lixun Zhang, Alex Brown, Yen Ong, Ali Yazdani, Sean Siddens, Ganesh Dasika, Alex Underwood

TL;DR
tritonBLAS is an analytical model that predicts near-optimal GPU GEMM kernel configurations using architectural parameters, eliminating the need for runtime autotuning and enabling efficient, practical deployment in HPC and ML applications.
Contribution
We introduce tritonBLAS, an analytical approach that models GPU architecture to select GEMM kernel parameters, achieving high performance with zero autotuning overhead.
Findings
Achieves over 95% of autotuning performance
Reduces autotuning time to zero
Works across diverse GEMM problem sizes
Abstract
We present tritonBLAS, a fast and deterministic analytical model that uses architectural parameters, such as the cache hierarchy and the relative placement of code and data, to generate performant GPU GEMM kernels. tritonBLAS explicitly models the relationship between architectural topology, matrix shapes, and algorithmic blocking behavior to predict near-optimal configurations without runtime autotuning. Based on this model, we implemented a lightweight GEMM framework entirely within Triton. We evaluate tritonBLAS across a diverse set of GEMM problem sizes on modern GPUs. It achieves over 95% of the performance of autotuned solutions while reducing autotuning time to zero. This makes tritonBLAS a practical drop-in replacement for empirical tuning in production HPC and ML workloads.
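The abstract describes predicting configurations from architectural parameters, but this page does not give the model itself. As a rough illustration of the general idea only, the Python sketch below ranks candidate tile shapes with a simple roofline-style cost estimate instead of timing each one. The `Gpu` fields and their values, the candidate lists, and the cost formula are all assumptions made for this example; they are not the tritonBLAS model and do not describe any particular device.

```python
import math
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Gpu:
    # Hypothetical placeholder values, not measurements of a real device.
    num_cus: int = 100            # compute units (or SMs)
    flops_per_cu: float = 1e12    # peak FLOP/s per compute unit
    mem_bw: float = 2e12          # global memory bandwidth in bytes/s
    lds_bytes: int = 64 * 1024    # on-chip scratch memory per compute unit


def estimate_time(m, n, k, bm, bn, bk, gpu, elem_bytes=2):
    """Crude time estimate for one tile shape; inf if the tiles do not fit."""
    # One A tile (bm x bk) and one B tile (bk x bn) must fit on chip.
    if (bm * bk + bk * bn) * elem_bytes > gpu.lds_bytes:
        return math.inf
    tiles = math.ceil(m / bm) * math.ceil(n / bn)
    # Wave quantization: output tiles are processed num_cus at a time.
    waves = math.ceil(tiles / gpu.num_cus)
    k_padded = math.ceil(k / bk) * bk
    # Time for one compute unit to finish one output tile.
    compute = 2 * bm * bn * k_padded / gpu.flops_per_cu
    # Time to stream the A and B panels of all concurrently running tiles
    # through shared bandwidth (cache reuse is ignored in this sketch).
    concurrent = min(tiles, gpu.num_cus)
    memory = (bm + bn) * k_padded * elem_bytes * concurrent / gpu.mem_bw
    return waves * max(compute, memory)


def select_config(m, n, k, gpu=Gpu(),
                  sizes=(32, 64, 128, 256), bks=(16, 32, 64)):
    """Return the (bm, bn, bk) candidate with the lowest estimated time."""
    return min(product(sizes, sizes, bks),
               key=lambda c: estimate_time(m, n, k, *c, gpu))


if __name__ == "__main__":
    print(select_config(4096, 4096, 4096))
```

Because the choice is a pure function of the problem shape and the hardware description, it is deterministic and costs no kernel launches, which is the property the abstract contrasts with empirical autotuning.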
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Advanced Data Storage Technologies
