PyExaFMM: an exercise in designing high-performance software with Python and Numba
Srinath Kailasa, Tingyu Wang, Lorena A. Barba, Timo Betcke

TL;DR
This paper discusses the development of PyExaFMM, a high-performance, multithreaded Python implementation of the Fast Multipole Method using Numba, highlighting the challenges and potential of Python for complex scientific algorithms.
Contribution
It demonstrates that high-performance, multithreaded scientific algorithms can be effectively implemented in Python using Numba, despite significant development challenges.
Findings
Numba enables CPU resource utilization comparable to C++ for complex algorithms
Developing performant Numba code for complex data structures is challenging
Python with Numba can be a viable platform for high-performance scientific computing
Abstract
Numba is a game-changing compiler for high-performance computing with Python. It produces machine code that runs outside of the single-threaded Python interpreter and that fully utilizes the resources of modern CPUs. This means it supports multithreaded parallelism and, where the hardware allows, auto-vectorization, as compiled languages such as C++ or Fortran do. In this article we document our experience developing PyExaFMM, a multithreaded Numba implementation of the Fast Multipole Method, an algorithm with a non-linear data structure and a large amount of data organization. We find that designing performant Numba code for complex algorithms can be as challenging as writing in a compiled language.
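To make the abstract's point about multithreading and auto-vectorization concrete, here is a minimal sketch of a Numba-compiled kernel. It is not taken from PyExaFMM. The function name `direct_potential`, the unnormalized 1/r kernel, and the array layout are illustrative assumptions. It evaluates potentials by direct summation, the O(N^2) calculation that the Fast Multipole Method is designed to accelerate. If Numba is not installed, the sketch falls back to plain Python so that it still runs, only slowly.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback (assumption for portability): run as plain Python without Numba.
    def njit(*args, **kwargs):
        return lambda func: func
    prange = range


@njit(parallel=True, fastmath=True)
def direct_potential(sources, targets, charges):
    """Sum charges[j] / |targets[i] - sources[j]| over j for every target i.

    Hypothetical helper for illustration; arrays are float64, with
    sources of shape (n_sources, 3) and targets of shape (n_targets, 3).
    """
    n_targets = targets.shape[0]
    n_sources = sources.shape[0]
    potentials = np.zeros(n_targets)

    # prange distributes the outer loop over threads; each iteration
    # writes only to its own entry, so there is no race on potentials.
    for i in prange(n_targets):
        acc = 0.0
        # The inner loop is a simple reduction that the compiler may vectorize.
        for j in range(n_sources):
            dx = targets[i, 0] - sources[j, 0]
            dy = targets[i, 1] - sources[j, 1]
            dz = targets[i, 2] - sources[j, 2]
            r2 = dx * dx + dy * dy + dz * dz
            if r2 > 0.0:  # skip coincident points to avoid division by zero
                acc += charges[j] / np.sqrt(r2)
        potentials[i] = acc
    return potentials


# Usage: two point charges evaluated at a single target.
sources = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
charges = np.array([1.0, 2.0])
targets = np.array([[0.0, 0.0, 1.0]])
phi = direct_potential(sources, targets, charges)
```

A flat loop over arrays like this is the easy case for Numba. The difficulty the abstract describes arises once the computation has to be organized around a tree, where the data layout and bookkeeping are much harder to express in code that Numba compiles efficiently.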
Taxonomy
Topics: Computational Physics and Python Applications · Parallel Computing and Optimization Techniques
