Leveraging MLIR for Loop Vectorization and GPU Porting of FFT Libraries
Yifei He, Artur Podobas, Stefano Markidis

TL;DR
This paper introduces FFTc, a DSL that leverages MLIR to optimize FFT library generation, enabling efficient CPU vectorization and GPU porting, with performance comparable to existing libraries.
Contribution
It extends FFTc with new data layout options, sparsification for vectorization, and GPU support, advancing FFT library automation and optimization.
Findings
CPU performance comparable to FFTW due to vectorization
Successful GPU porting with promising initial results
Enhanced data layout and sparsification improve efficiency
Abstract
FFTc is a Domain-Specific Language (DSL) for designing and generating Fast Fourier Transform (FFT) libraries. FFTc is unique in that it leverages and extends Multi-Level Intermediate Representation (MLIR) dialects to optimize FFT code generation. In this work, we present FFTc extensions and improvements: support for different data layouts for complex-valued arrays, sparsification to enable efficient vectorization, and seamless porting of FFT libraries to GPU systems. We show that, on CPUs, thanks to vectorization, the performance of the FFTc-generated FFT is comparable to that of FFTW, a state-of-the-art FFT library. We also present initial performance results for FFTc on Nvidia GPUs.
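The abstract does not say which complex-array layouts FFTc supports. As a rough illustration of why layout matters for vectorization, the sketch below contrasts two commonly used options: interleaved storage (real and imaginary parts alternating in one array) and split storage (separate real and imaginary arrays). The function names are hypothetical and are not part of FFTc; this is a minimal sketch under those assumptions, not the paper's implementation.

```python
# Hypothetical illustration only: none of these names come from FFTc.

def to_split(interleaved):
    """Convert [re0, im0, re1, im1, ...] into ([re0, re1, ...], [im0, im1, ...])."""
    return interleaved[0::2], interleaved[1::2]


def to_interleaved(re, im):
    """Convert separate real and imaginary arrays back to interleaved storage."""
    out = [0.0] * (2 * len(re))
    out[0::2] = re
    out[1::2] = im
    return out


def twiddle_multiply_split(re, im, w_re, w_im):
    """Elementwise complex multiply in split layout.

    Each output array is computed from contiguous, unit-stride inputs,
    which is the access pattern SIMD units generally handle best.
    """
    out_re = [a * c - b * d for a, b, c, d in zip(re, im, w_re, w_im)]
    out_im = [a * d + b * c for a, b, c, d in zip(re, im, w_re, w_im)]
    return out_re, out_im


# Two complex samples, 1+2j and 3+4j, stored interleaved.
data = [1.0, 2.0, 3.0, 4.0]
re, im = to_split(data)

# Multiply the first sample by j and the second by 1.
w_re, w_im = [0.0, 1.0], [1.0, 0.0]
out_re, out_im = twiddle_multiply_split(re, im, w_re, w_im)
result = to_interleaved(out_re, out_im)
```

In the interleaved form, the same multiply needs shuffles to separate real and imaginary lanes before the arithmetic, which is one reason a code generator might want to choose the layout per target.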
Taxonomy
Topics: Digital Filter Design and Implementation · Parallel Computing and Optimization Techniques · Model Reduction and Neural Networks
