Fully Empirical Autotuned QR Factorization For Multicore Architectures
Emmanuel Agullo (INRIA Bordeaux - Sud-Ouest, LaBRI), Jack Dongarra (ICL), Rajib Nath (ICL), Stanimire Tomov (ICL)

TL;DR
This paper presents a fully empirical, automatic tuning method for dense QR factorization on multicore architectures, achieving near-optimal performance quickly without relying on complex models.
Contribution
It introduces a novel empirical approach that efficiently prunes the search space for autotuning QR factorization on multicore systems, enabling fast and reliable performance optimization.
Findings
Tuning completes at install time in less than 10 minutes on five out of seven platforms.
Achieves an average of 97% to 100% of the optimal performance, depending on the platform.
Method is applicable to multiple hardware architectures.
Abstract
Tuning numerical libraries has become more difficult over time, as systems get more sophisticated. In particular, modern multicore machines make the behaviour of algorithms hard to forecast and model. In this paper, we tackle the issue of tuning a dense QR factorization on multicore architectures. We show that it is hard to rely on a model, which motivates us to design a fully empirical approach. We exhibit a few strong empirical properties that enable us to efficiently prune the search space. Our method is automatic, fast and reliable. The tuning process is indeed fully performed at install time in less than one and ten minutes on five out of seven platforms. We achieve an average performance varying from 97% to 100% of the optimum performance depending on the platform. This work is a basis for autotuning the PLASMA library and enabling easy performance portability across hardware…
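The idea of a fully empirical tuner that prunes its search space can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not say which parameters are tuned or which empirical properties drive the pruning. The sketch assumes a hypothetical two-integer configuration (for example an outer and an inner block size), a cheap per-configuration kernel measurement used to discard weak candidates, and a more expensive whole-factorization measurement run only on the survivors. The names `prune_by_kernel`, `tune`, `keep_ratio`, and the synthetic rate tables are all illustrative assumptions. In a real tuner the two callables would time actual runs on the target machine.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical configuration: two integer tuning parameters, e.g. an outer
# and an inner block size. The source does not name the tuned parameters.
Config = Tuple[int, int]


def prune_by_kernel(candidates: Sequence[Config],
                    kernel_rate: Callable[[Config], float],
                    keep_ratio: float = 0.85) -> List[Config]:
    """Measure a cheap kernel once per candidate and keep only the
    candidates whose rate is at least keep_ratio times the best rate."""
    rates = {c: kernel_rate(c) for c in candidates}
    threshold = keep_ratio * max(rates.values())
    return [c for c in candidates if rates[c] >= threshold]


def tune(candidates: Sequence[Config],
         kernel_rate: Callable[[Config], float],
         full_rate: Callable[[Config, int], float],
         sizes: Sequence[int],
         keep_ratio: float = 0.85) -> Dict[int, Config]:
    """Prune with the cheap measurement, then pick, for each matrix size,
    the surviving configuration with the highest full-run rate."""
    survivors = prune_by_kernel(candidates, kernel_rate, keep_ratio)
    return {n: max(survivors, key=lambda c: full_rate(c, n)) for n in sizes}


# Synthetic stand-ins for real measurements (made-up numbers, for
# illustration only; a real tuner would time the kernel and factorization).
KERNEL_RATE = {(64, 16): 40.0, (64, 32): 50.0, (128, 16): 55.0,
               (128, 32): 70.0, (256, 16): 60.0, (256, 32): 80.0}


def fake_kernel_rate(c: Config) -> float:
    return KERNEL_RATE[c]


def fake_full_rate(c: Config, n: int) -> float:
    # Toy model: small matrices expose too few blocks to keep 8 workers busy.
    efficiency = min(1.0, (n // c[0]) / 8)
    return KERNEL_RATE[c] * efficiency


candidates = list(KERNEL_RATE)
survivors = prune_by_kernel(candidates, fake_kernel_rate)
table = tune(candidates, fake_kernel_rate, fake_full_rate, sizes=[512, 4096])
print(survivors)
print(table)
```

With these synthetic numbers, four of the six candidates are discarded after the cheap step, so the expensive step runs on two configurations per matrix size instead of six. The resulting table maps each matrix size to its best surviving configuration, which is the kind of lookup a library could store at install time.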
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Embedded Systems Design Techniques
