Combinatorial BLAS 2.0: Scaling combinatorial algorithms on distributed-memory systems
Ariful Azad, Oguz Selvitopi, Md Taufique Hussain, John R. Gilbert, Aydin Buluc

TL;DR
Combinatorial BLAS 2.0 enhances distributed-memory combinatorial algorithms with features like communication avoidance, GPU support, and scalable I/O, enabling efficient parallel processing for graph analysis and related fields.
Contribution
This paper introduces key technical advancements in Combinatorial BLAS 2.0, including communication avoidance, hierarchical parallelism, GPU acceleration, and scalable I/O for improved distributed combinatorial computations.
Findings
Enhanced scalability and performance in distributed combinatorial algorithms
Effective use of GPU kernels and multithreading for acceleration
Guidelines for selecting data structures and functions in various scenarios
Abstract
Combinatorial algorithms such as those that arise in graph analysis, modeling of discrete systems, bioinformatics, and chemistry, are often hard to parallelize. The Combinatorial BLAS library implements key computational primitives for rapid development of combinatorial algorithms in distributed-memory systems. During the decade since its first introduction, the Combinatorial BLAS library has evolved and expanded significantly. This paper details many of the key technical features of Combinatorial BLAS version 2.0, such as communication avoidance, hierarchical parallelism via in-node multithreading, accelerator support via GPU kernels, generalized semiring support, implementations of key data structures and functions, and scalable distributed I/O operations for human-readable files. Our paper also presents several rules of thumb for choosing the right data structures and functions in…
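To make "generalized semiring support" concrete, here is a minimal single-process sketch of a sparse matrix-vector product in which the usual (+, ×) pair is replaced by user-supplied operations. It is not the Combinatorial BLAS API: the function name `semiring_spmv`, the dictionary-based sparse storage, and the example graph are all assumptions made for illustration, and nothing here is distributed or multithreaded.

```python
from typing import Callable, Dict, Tuple, TypeVar

T = TypeVar("T")


def semiring_spmv(
    A: Dict[Tuple[int, int], T],
    x: Dict[int, T],
    add: Callable[[T, T], T],
    multiply: Callable[[T, T], T],
) -> Dict[int, T]:
    """Compute y = A x over a user-supplied semiring (hypothetical helper).

    A maps (row, col) -> value and x maps index -> value. Entries that are
    absent are treated as the additive identity, so they are simply skipped.
    """
    y: Dict[int, T] = {}
    for (i, j), a_ij in A.items():
        if j in x:
            term = multiply(a_ij, x[j])
            # Accumulate with the semiring "add"; the first term is stored as-is.
            y[i] = add(y[i], term) if i in y else term
    return y


# Illustrative weighted directed graph: A[(i, j)] is the weight of edge j -> i.
A = {(1, 0): 2.0, (2, 0): 5.0, (2, 1): 1.0, (3, 2): 3.0}

# Min-plus semiring: each product relaxes tentative shortest-path distances.
dist = {0: 0.0}
step1 = semiring_spmv(A, dist, add=min, multiply=lambda a, b: a + b)

# Keep the best distance seen so far, then relax once more.
dist2 = {**dist, **step1}
step2 = semiring_spmv(A, dist2, add=min, multiply=lambda a, b: a + b)

# Boolean (or, and) semiring: the same product expands a BFS frontier.
frontier = {0: True}
reached = semiring_spmv(
    A, frontier, add=lambda a, b: a or b, multiply=lambda a, b: True
)
```

The same loop serves both a shortest-path relaxation and a breadth-first frontier expansion; only the two operations passed in change, which is the idea behind expressing graph algorithms through semiring-parameterized sparse primitives.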
Taxonomy
Topics: Distributed and Parallel Computing Systems · Distributed systems and fault tolerance · Interconnection Networks and Systems
