Rosko: Row Skipping Outer Products for Sparse Matrix Multiplication Kernels

Vikas Natesh; Andrew Sabot; H.T. Kung; Mark Ting

arXiv:2307.03930 · cs.LG · July 11, 2023

TL;DR

Rosko introduces a row skipping outer product technique that reduces computation and memory access in sparse matrix multiplication for neural networks, outperforming existing auto-tuning, search-based, and vendor-optimized solutions.

Contribution

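The paper presents Rosko, a method for deriving sparse matrix multiplication kernels that skip entire row computations during execution, reducing computation and memory access. The kernels are derived analytically from hardware characteristics, without auto-tuning or search space exploration.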

Findings

01 Achieves up to 6.5x runtime reduction on CPUs

02 Effectively handles sparsities from 65% to 99.8%

03 Outperforms existing auto-tuning and vendor-optimized libraries

Abstract

We propose Rosko -- row skipping outer products -- for deriving sparse matrix multiplication (SpMM) kernels in reducing computation and memory access requirements of deep neural networks (DNNs). Rosko allows skipping of entire row computations during program execution with low sparsity-management overheads. We analytically derive sparse CPU kernels that adapt to given hardware characteristics to effectively utilize processor cores and minimize data movement without the need for auto-tuning or search space exploration. Rosko can be integrated with other outer product scheduling methods, allowing them to leverage row skipping by using Rosko's packing format to skip unnecessary computation. Rosko kernels outperform existing auto-tuning and search-based solutions as well as state-of-the-art vendor-optimized libraries on real hardware across a variety of neural network workloads. For…
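The row skipping idea in the abstract can be illustrated with a small sketch. It is not the authors' kernel: it assumes the sparse operand is A in C = A·B, that each rank-1 update pairs a column of A with a row of B, and that a zero entry in the A column lets the whole corresponding row of that update be skipped. The names `pack_columns` and `row_skipping_spmm` are hypothetical, and the packing shown is a plain list of (row, value) pairs, not Rosko's actual packing format.

```python
def pack_columns(a):
    """Hypothetical packing: for each column k of A, keep only the
    nonzero entries as (row index, value) pairs."""
    m, k_dim = len(a), len(a[0])
    return [[(i, a[i][k]) for i in range(m) if a[i][k] != 0]
            for k in range(k_dim)]


def row_skipping_spmm(a, b):
    """Compute C = A @ B as a sum of outer products, skipping every row
    of a rank-1 update whose A entry is zero.

    Returns (C, skipped), where skipped counts the row updates avoided.
    """
    m, n = len(a), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    skipped = 0
    for k, column in enumerate(pack_columns(a)):
        b_row = b[k]                      # row k of B, reused for the whole column
        skipped += m - len(column)        # rows of this outer product never touched
        for i, a_ik in column:            # only rows with a nonzero A entry
            c_row = c[i]
            for j in range(n):
                c_row[j] += a_ik * b_row[j]
    return c, skipped


if __name__ == "__main__":
    A = [[1, 0],
         [0, 0],
         [0, 2]]
    B = [[1, 2],
         [3, 4]]
    C, skipped = row_skipping_spmm(A, B)
    print(C, skipped)
```

In this sketch the sparsity bookkeeping is done once, when A is packed, so the inner loops contain no zero checks. A real CPU kernel would additionally tile the loops for the cache hierarchy and vectorize the innermost update, which this example does not attempt.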

Code & Models

Repositories

vnatesh/rosko (official)

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications