
TL;DR
This paper discusses implementing overlap fermions on NVIDIA GPUs using CUDA, focusing on algorithms, implementation details, and performance evaluation.
Contribution
It presents a GPU-based implementation of overlap fermions, optimizing algorithms for high-performance lattice QCD computations.
Findings
Reports a significant speedup over CPU implementations
Describes implementation strategies for GPU optimization in detail
Presents performance benchmarks demonstrating the code's efficiency
Abstract
We report on our efforts to implement overlap fermions on NVIDIA GPUs using CUDA, commenting on the algorithms used, implementation details, and the performance of our code.
