TaskTorrent: a Lightweight Distributed Task-Based Runtime System in C++
Léopold Cambier, Yizhou Qian, Eric Darve

TL;DR
TaskTorrent is a lightweight, distributed C++14 runtime system that efficiently manages task graphs using MPI, enabling scalable execution of large linear algebra problems on thousands of cores.
Contribution
TaskTorrent introduces a novel distributed task-based runtime, written in C++14 on top of MPI, whose task graph is discovered in a fully distributed manner.
Findings
Minimal overhead compared to existing solutions
Excellent scalability to thousands of cores
Effective application to large linear algebra problems
Abstract
We present TaskTorrent, a lightweight distributed task-based runtime in C++. TaskTorrent uses a parametrized task graph to express the task DAG, and one-sided active messages to trigger remote tasks asynchronously. As a result, the task DAG is completely distributed and discovered in parallel. It is a C++14 library and depends only on MPI. We explain the API and the implementation. We perform a series of benchmarks against StarPU and ScaLAPACK. Microbenchmarks show that it has minimal overhead compared to other solutions. We then apply it to two large linear algebra problems. TaskTorrent scales very well to thousands of cores, exhibiting good weak and strong scaling.
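To make the idea of a parametrized task graph more concrete, here is a minimal single-process sketch in C++14. It is not TaskTorrent's API: the class `MiniTaskflow` and the names `set_indegree`, `set_task`, `fulfill_promise`, `run`, and `run_diamond` are hypothetical and chosen only for illustration. Tasks are identified by an integer key, the DAG is never stored explicitly, and each task body announces its dependents when it finishes. In a distributed setting, that announcement is the step that an active message would carry to a remote rank; here it is assumed to be a plain local function call, and there is no MPI and no threading.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch of a parametrized task graph (PTG), single process only.
// All names are illustrative assumptions, not taken from TaskTorrent.
class MiniTaskflow {
public:
    using Key = int;

    // Function giving the number of incoming dependencies of task k.
    void set_indegree(std::function<int(Key)> f) { indegree_ = std::move(f); }

    // Function giving the body of task k.
    void set_task(std::function<void(Key)> f) { task_ = std::move(f); }

    // Record that one dependency of task k has been satisfied.
    // The dependency counter is created lazily, on first contact with k,
    // so the graph is discovered as execution proceeds.
    void fulfill_promise(Key k) {
        auto it = remaining_.find(k);
        if (it == remaining_.end()) {
            const int d = indegree_(k);
            assert(d >= 1);  // every task, including a root, needs one trigger
            it = remaining_.emplace(k, d).first;
        }
        if (--(it->second) == 0) {
            remaining_.erase(it);
            ready_.push(k);
        }
    }

    // Execute ready tasks in FIFO order until none are left.
    void run() {
        while (!ready_.empty()) {
            const Key k = ready_.front();
            ready_.pop();
            task_(k);
        }
    }

private:
    std::function<int(Key)> indegree_;
    std::function<void(Key)> task_;
    std::unordered_map<Key, int> remaining_;  // unsatisfied dependencies per task
    std::queue<Key> ready_;                   // tasks whose dependencies are all met
};

// Build and run a diamond-shaped DAG, 0 -> {1, 2} -> 3,
// and return the order in which the tasks executed.
std::vector<int> run_diamond() {
    MiniTaskflow tf;
    std::vector<int> order;

    tf.set_indegree([](int k) { return k == 3 ? 2 : 1; });
    tf.set_task([&tf, &order](int k) {
        order.push_back(k);
        if (k == 0) {
            tf.fulfill_promise(1);
            tf.fulfill_promise(2);
        } else if (k == 1 || k == 2) {
            tf.fulfill_promise(3);
        }
    });

    tf.fulfill_promise(0);  // seed the root task
    tf.run();
    return order;
}
```

Calling `run_diamond()` seeds task 0, which triggers tasks 1 and 2; task 3 becomes ready only after both have fulfilled its two dependencies. The design choice worth noting is that dependencies are expressed as functions of the task key rather than as stored edges, so no process ever needs to hold the whole graph.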
