IAAT: A Input-Aware Adaptive Tuning framework for Small GEMM
Jianyu Yao, Boqian Shi, Chunyang Xiang, Haipeng Jia, Chendi Li, Hang Cao, Yunquan Zhang

TL;DR
This paper introduces IAAT, an adaptive framework that optimizes small GEMM computations by reducing boundary processing and pack operations, significantly improving performance on ARMv8 platforms.
Contribution
The paper presents a novel input-aware adaptive tuning framework for small GEMM that dynamically selects optimized kernels and tiling strategies to enhance performance.
Findings
IAAT outperforms existing BLAS libraries on ARMv8.
It reduces boundary processing and pack operation costs.
It provides a flexible, kernel-based approach to small GEMM optimization.
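To see why boundary processing weighs more heavily on small matrices, the sketch below computes the share of an output matrix that falls outside full tiles when a single fixed kernel shape is used. The function name `edge_fraction` and the 8 x 6 kernel shape are illustrative assumptions, not values taken from the paper.

```python
def edge_fraction(m, n, mr, nr):
    """Fraction of an m x n output matrix that lies outside full mr x nr tiles."""
    full_rows = (m // mr) * mr   # rows covered by complete tiles
    full_cols = (n // nr) * nr   # columns covered by complete tiles
    total = m * n
    return (total - full_rows * full_cols) / total


# Hypothetical fixed kernel shape (8 x 6), chosen only for illustration.
for size in (10, 100, 1000):
    print(size, edge_fraction(size, size, 8, 6))
```

For these sizes the leftover share shrinks as the matrix grows, which is consistent with the summary's point that boundary handling cannot be neglected for small inputs.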
Abstract
GEMM on small input matrices is becoming widely used in fields such as HPC and machine learning. Although many well-known BLAS libraries already support small GEMM, they cannot achieve near-optimal performance, because the cost of pack operations is high and frequent boundary processing cannot be neglected. This paper proposes an input-aware adaptive tuning framework (IAAT) for small GEMM to overcome the performance bottlenecks in state-of-the-art implementations. IAAT consists of two stages, the install-time stage and the run-time stage. In the run-time stage, IAAT tiles matrices into blocks to alleviate boundary processing. This stage utilizes an input-aware adaptive tile algorithm and plays the role of runtime tuning. In the install-time stage, IAAT auto-generates hundreds of kernels of different sizes to remove pack operations. Finally, IAAT finishes the…
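The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the even-split rule in `split_dim`, the maximum kernel shape, and all names (`make_kernel`, `KERNELS`, `small_gemm`) are assumptions made for illustration, and plain Python loops stand in for generated assembly kernels. The "install-time" part builds one kernel per block shape; the "run-time" part picks tile sizes from the input dimensions so that every tile matches an existing kernel and no separate boundary path is needed.

```python
from itertools import product

MAX_MR, MAX_NR = 4, 4  # hypothetical largest kernel shape, not from the paper


def make_kernel(mr, nr):
    """Build a kernel specialized for an mr x nr block of C."""
    def kernel(A, B, C, i0, j0, K):
        # Reads A and B in place, so nothing is packed into a temporary buffer.
        for i in range(i0, i0 + mr):
            for j in range(j0, j0 + nr):
                acc = 0.0
                for k in range(K):
                    acc += A[i][k] * B[k][j]
                C[i][j] += acc
    return kernel


# "Install time": one kernel for every block shape up to the maximum.
KERNELS = {(mr, nr): make_kernel(mr, nr)
           for mr, nr in product(range(1, MAX_MR + 1), range(1, MAX_NR + 1))}


def split_dim(n, max_size):
    """Split n into near-equal tile sizes, each <= max_size, that sum to n."""
    if n == 0:
        return []
    count = -(-n // max_size)        # ceiling division
    base, extra = divmod(n, count)
    return [base + 1] * extra + [base] * (count - extra)


def small_gemm(A, B):
    """Compute C = A * B for non-empty row-major lists of lists."""
    M, K, N = len(A), len(B), len(B[0])
    C = [[0.0] * N for _ in range(M)]
    i0 = 0
    for mr in split_dim(M, MAX_MR):      # "run time": tiles chosen from the input shape
        j0 = 0
        for nr in split_dim(N, MAX_NR):
            KERNELS[(mr, nr)](A, B, C, i0, j0, K)
            j0 += nr
        i0 += mr
    return C
```

With these assumed limits, a dimension of 10 is split into tiles of 4, 3, and 3 rather than 4, 4, and a leftover 2, so each block is handled by a kernel of exactly its size.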
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Algorithms and Data Compression
