A Framework for Practical Parallel Fast Matrix Multiplication

Austin R. Benson; Grey Ballard

arXiv:1409.2908·cs.DC·January 8, 2018

TL;DR

This paper introduces a practical framework for implementing and benchmarking various fast matrix multiplication algorithms, demonstrating their performance advantages over classical methods on different problem sizes and shapes.

Contribution

The paper develops a code generation tool that automatically implements multiple sequential and shared-memory parallel variants of each fast algorithm, including a novel parallelization scheme, and analyzes their practical performance and implementation issues.

Findings

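01. Novel fast algorithms can outperform vendor implementations of the classical algorithm, as well as Strassen's fast algorithm, on modest problem sizes.

02. The best choice of fast algorithm depends on both the size and the shape of the matrices.

03. Practical implementation issues are identified for shared-memory machines.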

Abstract

Matrix multiplication is a fundamental computation in many scientific disciplines. In this paper, we show that novel fast matrix multiplication algorithms can significantly outperform vendor implementations of the classical algorithm and Strassen's fast algorithm on modest problem sizes and shapes. Furthermore, we show that the best choice of fast algorithm depends not only on the size of the matrices but also on their shape. We develop a code generation tool to automatically implement multiple sequential and shared-memory parallel variants of each fast algorithm, including our novel parallelization scheme. This allows us to rapidly benchmark over 20 fast algorithms on several problem sizes. Furthermore, we discuss a number of practical implementation issues for these algorithms on shared-memory machines that can direct further research on making fast algorithms practical.
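
For readers unfamiliar with what a "fast" algorithm looks like, the following is a minimal sketch of Strassen's algorithm, the baseline fast algorithm the abstract mentions. It is not the code produced by the paper's generator. It assumes square matrices whose dimension is a power of two, stored as plain Python lists of rows, and the names `classical_matmul`, `strassen`, and `cutoff` are hypothetical choices made for this illustration. The point it shows is that each recursive step uses 7 block multiplications instead of the classical 8, at the cost of extra block additions.

```python
def classical_matmul(A, B):
    """Classical triple-loop product of two matrices given as lists of rows."""
    rows, inner, cols = len(A), len(B), len(B[0])
    return [[sum(A[i][p] * B[p][j] for p in range(inner)) for j in range(cols)]
            for i in range(rows)]


def add(X, Y):
    """Entrywise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    """Entrywise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(M):
    """Split a square matrix of even dimension into four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B, cutoff=2):
    """Strassen's algorithm for n x n inputs, n a power of two (assumption).

    Below `cutoff` the recursion falls back to the classical algorithm,
    since the extra additions outweigh the saved multiplication on small blocks.
    """
    n = len(A)
    if n <= cutoff:
        return classical_matmul(A, B)

    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)

    # Seven recursive block products instead of the classical eight.
    M1 = strassen(add(A11, A22), add(B11, B22), cutoff)
    M2 = strassen(add(A21, A22), B11, cutoff)
    M3 = strassen(A11, sub(B12, B22), cutoff)
    M4 = strassen(A22, sub(B21, B11), cutoff)
    M5 = strassen(add(A11, A12), B22, cutoff)
    M6 = strassen(sub(A21, A11), add(B11, B12), cutoff)
    M7 = strassen(sub(A12, A22), add(B21, B22), cutoff)

    # Combine the seven products into the four quadrants of C.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)

    # Stitch the quadrants back into one matrix.
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

With integer inputs the two routines agree exactly, so `strassen(A, B, cutoff=1) == classical_matmul(A, B)` is a convenient sanity check. In a tuned implementation the base case would normally be a vendor routine rather than a Python loop, and the cutoff would be chosen empirically.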

Code & Models

Repositories

arbenson/fast-matmul
