SMASH: Sparse Matrix Atomic Scratchpad Hashing

Kaustubh Shivdikar

arXiv:2105.14156 · cs.DC · June 1, 2021

TL;DR

This work introduces a row-wise product SpGEMM (sparse general matrix-matrix multiplication) kernel that uses atomic instructions to merge partial products, achieving a 9.4x speedup over prior approaches on the PIUMA accelerator.

Contribution

A new row-wise product SpGEMM kernel leveraging atomic instructions is proposed, reducing memory overhead and improving performance on specialized hardware.

Findings

01 Achieves 9.4x speedup over prior approaches

02 Effectively reduces redundant memory fetches

03 Optimized for the PIUMA accelerator architecture

Abstract

Sparse matrices, more specifically SpGEMM kernels, are commonly found in a wide range of applications, spanning graph-based path-finding to machine learning algorithms (e.g., neural networks). A particular challenge in implementing SpGEMM kernels has been the pressure placed on DRAM memory. One approach to tackle this problem is to use an inner product method for the SpGEMM kernel implementation. While the inner product produces fewer intermediate results, it can end up saturating the memory bandwidth, given the high number of redundant fetches of the input matrix elements. Using an outer product-based SpGEMM kernel can reduce redundant fetches, but at the cost of increased overhead due to extra computation and memory accesses for producing/managing partial products. In this thesis, we introduce a novel SpGEMM kernel implementation based on the row-wise product approach. We leverage…
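
To make the row-wise product formulation concrete, here is a minimal sequential sketch in Python. It is not the SMASH kernel: the helper names `to_csr` and `spgemm_rowwise` are hypothetical, a per-row Python dictionary stands in for a hashed scratchpad, and a plain `+=` stands in for what would have to be an atomic add when several threads update the same output entry. The sketch only illustrates the access pattern: each nonzero A[i][k] scales row k of B, and the scaled entries are merged into output row i by hashing on the column index.

```python
from collections import defaultdict


def to_csr(dense):
    """Convert a dense list-of-lists matrix to CSR (indptr, indices, data)."""
    indptr, indices, data = [0], [], []
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                indices.append(j)
                data.append(v)
        indptr.append(len(indices))
    return indptr, indices, data


def spgemm_rowwise(a, b):
    """Row-wise product C = A @ B on CSR inputs; returns C in CSR.

    Illustrative only: a dict per output row plays the role of a hash
    table, and the sequential `+=` is where a parallel kernel would
    need an atomic update.
    """
    a_ptr, a_idx, a_val = a
    b_ptr, b_idx, b_val = b
    c_ptr, c_idx, c_val = [0], [], []
    for i in range(len(a_ptr) - 1):
        acc = defaultdict(float)  # hash accumulator for output row i
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k, a_ik = a_idx[p], a_val[p]
            # Each nonzero A[i][k] touches row k of B exactly once.
            for q in range(b_ptr[k], b_ptr[k + 1]):
                acc[b_idx[q]] += a_ik * b_val[q]
        for j in sorted(acc):  # emit the row with sorted column indices
            c_idx.append(j)
            c_val.append(acc[j])
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx, c_val


A = to_csr([[1, 0, 2],
            [0, 3, 0]])
B = to_csr([[0, 4],
            [5, 0],
            [6, 7]])
C = spgemm_rowwise(A, B)
```

In this formulation partial products for one output row are merged as soon as they are produced, so no full intermediate matrix is materialized as in the outer product approach, and rows of B are fetched only for the nonzeros of A that reference them.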
