Automatic Library Generation for Modular Polynomial Multiplication
Lingchuan Meng

TL;DR
This paper presents automated methods for generating optimized polynomial multiplication libraries using FFT and TFT, achieving performance comparable or superior to hand-tuned code through intelligent search and autotuning.
Contribution
It introduces an automated system for generating and tuning polynomial multiplication algorithms optimized for modern hardware, improving both theoretical and practical performance.
Findings
Autotuned implementations match or outperform hand-tuned code.
The system effectively optimizes for memory, vectorization, and multi-threading.
Performance improvements are demonstrated on various architectures.
Abstract
Polynomial multiplication is a key algorithm underlying computer algebra systems (CAS), and its efficient implementation is crucial for the performance of CAS. In this paper we design and implement algorithms for polynomial multiplication using approaches based on the fast Fourier transform (FFT) and the truncated Fourier transform (TFT). We improve on the state-of-the-art in both theoretical and practical performance. The SPIRAL library generation system is extended and used to automatically generate and tune the performance of a polynomial multiplication library that is optimized for memory hierarchy, vectorization and multi-threading, using new and existing algorithms. The performance tuning has been aided by the use of automation where many code choices are generated and intelligent search is utilized to find the "best" implementation on a given architecture. The performance of…
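To make the FFT-based approach mentioned in the abstract concrete, here is a minimal sketch of modular polynomial multiplication using a radix-2 number-theoretic transform (the finite-field analogue of the FFT). It is not the code SPIRAL generates: the prime `P = 998244353`, its primitive root `G = 3`, and the function names `ntt` and `poly_mul_mod` are assumptions chosen for illustration, and the sketch does none of the memory-hierarchy, vectorization, or multi-threading optimizations the paper targets.

```python
# Illustrative sketch only; P, G, and the function names are assumptions,
# not taken from the paper.
P = 998244353  # prime of the form 119 * 2^23 + 1, so 2^k-th roots of unity exist for k <= 23
G = 3          # a primitive root modulo P


def ntt(a, invert=False):
    """In-place iterative radix-2 transform over the integers mod P.

    len(a) must be a power of two (at most 2^23 for this prime).
    """
    n = len(a)
    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # Butterfly passes of doubling length.
    length = 2
    while length <= n:
        w_len = pow(G, (P - 1) // length, P)  # primitive length-th root of unity
        if invert:
            w_len = pow(w_len, P - 2, P)      # its modular inverse
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u = a[k]
                v = a[k + half] * w % P
                a[k] = (u + v) % P
                a[k + half] = (u - v) % P
                w = w * w_len % P
        length <<= 1
    if invert:
        n_inv = pow(n, P - 2, P)  # scale by 1/n to complete the inverse transform
        for i in range(n):
            a[i] = a[i] * n_inv % P


def poly_mul_mod(f, g):
    """Multiply coefficient lists f and g (lowest degree first) modulo P."""
    if not f or not g:
        return []
    out_len = len(f) + len(g) - 1
    n = 1
    while n < out_len:  # pad up to the next power of two
        n <<= 1
    fa = [c % P for c in f] + [0] * (n - len(f))
    ga = [c % P for c in g] + [0] * (n - len(g))
    ntt(fa)
    ntt(ga)
    prod = [x * y % P for x, y in zip(fa, ga)]  # pointwise product in the transform domain
    ntt(prod, invert=True)
    return prod[:out_len]
```

For example, `poly_mul_mod([1, 2], [3, 4])` computes (1 + 2x)(3 + 4x) modulo `P`. Note that this sketch always pads to the next power of two; avoiding the cost of that padding is the motivation usually given for the truncated Fourier transform.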
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Numerical Methods and Algorithms · Polynomial and algebraic computation
