Tuning Technique for Multiple Precision Dense Matrix Multiplication using Prediction of Computational Time
Tomonori Kouya

TL;DR
This paper proposes a prediction-based tuning technique for multiple precision dense real matrix multiplication that shortens tuning time across various precisions, matrix sizes, and degrees of parallelization.
Contribution
It introduces a method that predicts computational times for several block sizes of the blocking algorithm and uses those predictions to select the fastest matrix multiplication method.
Findings
Predicting computational times for several block sizes shortens tuning time.
The tuning selects the fastest matrix multiplication method for each precision.
The technique applies to various matrix sizes and degrees of parallelization.
Abstract
Although reliable long precision floating-point arithmetic libraries such as QD and MPFR/GMP are necessary to solve ill-conditioned problems in numerical simulation, long precision BLAS-level computation such as matrix multiplication has not been fully optimized because tuning costs are very high compared to IEEE float and double precision arithmetic. In this study, we develop a technique to shorten this tuning time by using prediction of computational times in several block sizes for the blocking algorithm, and then selecting the fastest matrix multiplication method for tuning multiple precision dense real matrix multiplication in various precisions, matrix sizes, and degrees of parallelization.
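The abstract does not describe the prediction model, so the following is only a minimal sketch of the general idea under stated assumptions: time a single block product for each candidate block size, extrapolate to the full matrix by the number of block products, and pick the candidate with the smallest predicted time. Python's standard `decimal` module stands in for QD or MPFR/GMP, and the names `sample_matrix`, `block_multiply`, `predict_time`, and `select_block_size` are hypothetical, not taken from the paper.

```python
import time
from decimal import Decimal, getcontext


def sample_matrix(m):
    """Hypothetical helper: a deterministic m x m matrix of Decimal values."""
    return [[Decimal(i * m + j + 1) / Decimal(7) for j in range(m)]
            for i in range(m)]


def block_multiply(A, B, n, bs):
    """Blocked product C = A * B for n x n matrices with block size bs."""
    C = [[Decimal(0)] * n for _ in range(n)]
    for i0 in range(0, n, bs):
        for k0 in range(0, n, bs):
            for j0 in range(0, n, bs):
                # Multiply one block of A by one block of B into C.
                for i in range(i0, min(i0 + bs, n)):
                    Ci, Ai = C[i], A[i]
                    for k in range(k0, min(k0 + bs, n)):
                        a, Bk = Ai[k], B[k]
                        for j in range(j0, min(j0 + bs, n)):
                            Ci[j] += a * Bk[j]
    return C


def predict_time(n, bs):
    """Assumed model: (time of one block product) * (number of block products).

    Edge blocks are counted as full blocks, so this can overestimate
    when bs does not divide n.
    """
    m = min(bs, n)
    A, B = sample_matrix(m), sample_matrix(m)
    start = time.perf_counter()
    block_multiply(A, B, m, m)  # one m x m block product
    elapsed = time.perf_counter() - start
    blocks_per_dim = -(-n // bs)  # ceiling division
    return elapsed * blocks_per_dim ** 3


def select_block_size(n, candidates):
    """Return the candidate with the smallest predicted time, plus all predictions."""
    predictions = {bs: predict_time(n, bs) for bs in candidates}
    best = min(predictions, key=predictions.get)
    return best, predictions
```

A call such as `select_block_size(256, [16, 32, 64])` after setting `getcontext().prec` to the target number of decimal digits returns the chosen block size without running any full-size product. In pure Python the block size has little effect on real performance, so this illustrates the tuning workflow only, not the speedups a compiled multiple precision library would show.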
Taxonomy
Topics: Distributed and Parallel Computing Systems · Matrix Theory and Algorithms · Advanced Data Storage Technologies
