TL;DR
This paper introduces RACE, a recursive algebraic coloring algorithm that significantly improves hardware efficiency and scalability for symmetric sparse matrix-vector multiplication on multicore systems.
Contribution
The paper presents RACE, a novel coloring algorithm and open-source library that overcomes limitations of existing methods in load balancing and memory hierarchy utilization.
Findings
RACE outperforms state-of-the-art coloring techniques and Intel MKL on multiple matrices.
RACE scales efficiently on multicore processors, with performance in line with Roofline model predictions.
The approach is applicable to various sparse matrix operations with data dependencies.
Abstract
The symmetric sparse matrix-vector multiplication (SymmSpMV) is an important building block for many numerical linear algebra kernel operations or graph traversal applications. Parallelizing SymmSpMV on today's multicore platforms with up to 100 cores is difficult due to the need to manage conflicting updates on the result vector. Coloring approaches can be used to solve this problem without data duplication, but existing coloring algorithms do not take load balancing and deep memory hierarchies into account, hampering scalability and full-chip performance. In this work, we propose the recursive algebraic coloring engine (RACE), a novel coloring algorithm and open-source library implementation, which eliminates the shortcomings of previous coloring methods in terms of hardware efficiency and parallelization overhead. We describe the level construction, distance-k coloring, and load…
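To make the conflicting-update problem concrete, here is a minimal serial sketch of SymmSpMV in pure Python. It is not taken from the RACE library: the function name `symm_spmv`, the CSR array names, and the choice to store only the upper triangle (including the diagonal) are assumptions made for illustration.

```python
def symm_spmv(n, row_ptr, col_idx, vals, x):
    """Compute b = A @ x for a symmetric A whose upper triangle
    (diagonal included) is stored in CSR form. Illustrative sketch only."""
    b = [0.0] * n
    for row in range(n):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            col = col_idx[k]
            # Contribution of the stored entry A[row][col].
            b[row] += vals[k] * x[col]
            if col != row:
                # Contribution of the mirrored entry A[col][row].
                # This writes to b[col], an element that another row's
                # iteration also updates: the source of write conflicts
                # when rows are processed in parallel.
                b[col] += vals[k] * x[row]
    return b


# Hypothetical 3x3 symmetric matrix:
# [[2, 1, 0],
#  [1, 3, 4],
#  [0, 4, 5]]
row_ptr = [0, 2, 4, 5]
col_idx = [0, 1, 1, 2, 2]
vals = [2.0, 1.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

result = symm_spmv(3, row_ptr, col_idx, vals, x)
print(result)  # [4.0, 19.0, 23.0]
```

In this example rows 0 and 1 both update `b[1]`, and rows 1 and 2 both update `b[2]`. If those rows ran on different threads without synchronization, the updates could race. A coloring scheme groups rows so that no two rows in the same group write to the same element of `b`, which lets each group run in parallel without duplicating the result vector.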
