
TL;DR
This paper presents a cache-friendly truncated FFT and inverse truncated FFT tailored to large coefficients, such as those arising in polynomial multiplication algorithms like Schönhage-Strassen.
Contribution
It introduces a new cache-friendly approach to truncated FFTs and provides two implementations with performance analysis.
Findings
Enhanced cache efficiency in truncated FFT computations
Two implementations demonstrating performance gains
Applicability to large coefficient polynomial multiplication
Abstract
We describe a cache-friendly version of van der Hoeven's truncated FFT and inverse truncated FFT, focusing on the case of 'large' coefficients, such as those arising in the Schönhage-Strassen algorithm for multiplication in Z[x]. We describe two implementations and examine their performance.
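For readers unfamiliar with the truncated FFT, the following is a minimal sketch of the basic idea, not the paper's cache-friendly variant or either of its implementations. It assumes small modular coefficients (a prime `p` and a primitive root of unity `omega`, both chosen here purely for illustration) rather than the 'large' coefficients the paper targets. The names `truncated_fft` and `bit_reverse` are hypothetical. The sketch computes only the first `n` outputs of a length-`L` decimation-in-frequency transform, in bit-reversed order, and skips butterflies whose results would land entirely at positions `n` or later. It does not exploit zero inputs.

```python
def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def truncated_fft(coeffs, n, omega, p):
    """Illustrative truncated FFT over the integers mod p.

    coeffs: list of length L (a power of two), coefficients of A(x).
    omega:  a primitive L-th root of unity mod p (assumed, not checked).
    Returns [A(omega**bit_reverse(i, log2 L)) mod p for i in range(n)].
    """
    a = [c % p for c in coeffs]

    def rec(lo, length, w):
        # Skip blocks of size 1 and blocks whose outputs all lie at index >= n.
        if length == 1 or lo >= n:
            return
        half = length // 2
        need_right = lo + half < n  # is any output of the right half wanted?
        t = 1  # running power w**i
        for i in range(half):
            u, v = a[lo + i], a[lo + i + half]
            a[lo + i] = (u + v) % p
            if need_right:
                a[lo + i + half] = (u - v) * t % p
            t = t * w % p
        w2 = w * w % p
        rec(lo, half, w2)
        rec(lo + half, half, w2)

    rec(0, len(a), omega)
    return a[:n]


# Example: length L = 8, keep n = 5 outputs; 2 is a primitive 8th root of unity mod 17.
values = truncated_fft([3, 1, 4, 1, 5, 0, 0, 0], 5, 2, 17)
```

Each returned value can be checked against direct evaluation of the polynomial at the corresponding bit-reversed power of `omega`.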
Taxonomy
Topics: Numerical Methods and Algorithms · Cryptography and Residue Arithmetic · Digital Filter Design and Implementation
