FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural Networks

Md. Khaledur Rahman; Majedul Haque Sujon; Ariful Azad

arXiv:2011.06391 · cs.LG · October 28, 2021

TL;DR

FusedMM is a single kernel that unifies sampled dense-dense matrix multiplication (SDDMM) and sparse-dense matrix multiplication (SpMM), speeding up graph embedding and GNN computations on Intel, AMD, and ARM processors.

Contribution

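The paper introduces FusedMM, a kernel that unifies SDDMM and SpMM in a single operation, uses user-defined functions to cover the computational patterns of popular graph embedding and GNN methods, and runs an order of magnitude faster than the equivalent kernels in Deep Graph Library.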

Findings

01

FusedMM is an order of magnitude faster than the equivalent kernels in Deep Graph Library.

02

It performs well on Intel, AMD, and ARM processors.

03

FusedMM speeds up an end-to-end graph embedding algorithm by up to 28x.

Abstract

We develop a fused matrix multiplication kernel that unifies sampled dense-dense matrix multiplication and sparse-dense matrix multiplication under a single operation called FusedMM. By using user-defined functions, FusedMM can capture almost all computational patterns needed by popular graph embedding and GNN approaches. FusedMM is an order of magnitude faster than its equivalent kernels in Deep Graph Library. The superior performance of FusedMM comes from the low-level vectorized kernels, a suitable load balancing scheme and an efficient utilization of the memory bandwidth. FusedMM can tune its performance using a code generator and perform equally well on Intel, AMD and ARM processors. FusedMM speeds up an end-to-end graph embedding algorithm by up to 28x on different processors.
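To make the idea of unifying the two operations concrete, the following is a minimal NumPy sketch, not the authors' implementation or API. It assumes a graph stored in CSR form (`indptr`, `indices`), dense feature matrices `X` and `Y`, and a single hypothetical user-defined function `edge_fn` applied to each edge score; the function names `sddmm_then_spmm` and `fused_mm` are illustrative only. The unfused version stores one intermediate value per nonzero before aggregating, while the fused version computes and consumes each edge value in the same loop iteration.

```python
import numpy as np

def sddmm_then_spmm(indptr, indices, X, Y, edge_fn):
    """Unfused baseline: SDDMM materializes one value per edge, then SpMM aggregates."""
    n = len(indptr) - 1
    vals = np.empty(len(indices))
    # SDDMM: dot products only at the nonzero (sampled) positions of the graph.
    for u in range(n):
        for k in range(indptr[u], indptr[u + 1]):
            vals[k] = edge_fn(X[u] @ Y[indices[k]])
    # SpMM: multiply the sparse matrix of edge values by the dense matrix Y.
    Z = np.zeros_like(X)
    for u in range(n):
        for k in range(indptr[u], indptr[u + 1]):
            Z[u] += vals[k] * Y[indices[k]]
    return Z

def fused_mm(indptr, indices, X, Y, edge_fn):
    """Fused sketch: each edge value is used immediately and never stored."""
    n = len(indptr) - 1
    Z = np.zeros_like(X)
    for u in range(n):
        for k in range(indptr[u], indptr[u + 1]):
            v = indices[k]
            s = edge_fn(X[u] @ Y[v])   # SDDMM step for edge (u, v)
            Z[u] += s * Y[v]           # SpMM step for the same edge
    return Z

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Toy 3-vertex graph in CSR form: edges 0->1, 0->2, 1->0, 2->0, 2->1.
indptr = np.array([0, 2, 3, 5])
indices = np.array([1, 2, 0, 0, 1])

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))
Y = rng.standard_normal((3, 4))

Z_unfused = sddmm_then_spmm(indptr, indices, X, Y, sigmoid)
Z_fused = fused_mm(indptr, indices, X, Y, sigmoid)

# Dense reference: mask the full score matrix with the adjacency pattern.
A = np.zeros((3, 3))
rows = np.repeat(np.arange(3), np.diff(indptr))
A[rows, indices] = 1.0
Z_dense = (A * sigmoid(X @ Y.T)) @ Y

print(np.allclose(Z_fused, Z_unfused), np.allclose(Z_fused, Z_dense))
```

The sketch shows only functional equivalence. The performance the abstract reports is attributed there to vectorized low-level kernels, load balancing, and memory-bandwidth utilization, none of which this Python loop attempts to reproduce.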

Code & Models

Repositories

HipGraph/FusedMM (official)
