JAMPI: efficient matrix multiplication in Spark using Barrier Execution Mode

Tamas Foldi; Chris von Csefalvay; Nicolas A. Perez

arXiv:2007.01811·cs.DC·July 21, 2020

TL;DR

This paper uses Apache Spark's barrier execution mode to implement efficient distributed matrix multiplication with Cannon's algorithm, improving performance and reducing memory usage for large matrices compared with the existing MLlib implementation, with applications in deep learning.

Contribution

The paper integrates Cannon's algorithm into Spark by combining barrier execution mode, distributed message passing over asynchronous network IO, and OpenJDK's auto-vectorization, improving distributed matrix multiplication performance.

Findings

01. Up to 24% performance improvement on 10,000 x 10,000 matrices

02. Significantly lower memory footprint compared to existing implementations

03. Enables faster deep learning training workflows

Abstract

The new barrier mode in Apache Spark allows embedding distributed deep learning training as a Spark stage to simplify the distributed training workflow. In Spark, a task in a stage does not depend on any other tasks in the same stage, and hence it can be scheduled independently. However, several algorithms require more sophisticated inter-task communications, similar to the MPI paradigm. By combining distributed message passing (using asynchronous network IO), OpenJDK's new auto-vectorization and Spark's barrier execution mode, we can add non-map/reduce based algorithms, such as Cannon's distributed matrix multiplication to Spark. We document an efficient distributed matrix multiplication using Cannon's algorithm, which improves significantly on the performance of the existing MLlib implementation. Used within a barrier task, the algorithm described herein results in an up to 24 percent…
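
To make the block-shifting idea behind Cannon's algorithm concrete, here is a minimal single-process sketch in Python. It is not the JAMPI implementation: the function names (`matmul`, `split_blocks`, `cannon`) are hypothetical, the p x p grid of workers is simulated with nested lists, and the block shifts that a distributed version would perform as messages between barrier tasks are modelled as list rotations. It assumes square matrices whose size is divisible by the grid size `p`.

```python
def matmul(a, b):
    """Plain triple-loop product of two dense matrices given as lists of rows."""
    inner = len(b)
    return [[sum(a[i][t] * b[t][j] for t in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def split_blocks(m, p):
    """Split an n x n matrix into a p x p grid of (n/p) x (n/p) blocks."""
    s = len(m) // p
    return [[[row[j * s:(j + 1) * s] for row in m[i * s:(i + 1) * s]]
             for j in range(p)]
            for i in range(p)]


def cannon(a, b, p):
    """Simulate Cannon's algorithm on a p x p grid (illustrative only)."""
    n = len(a)
    assert n % p == 0, "assumption: matrix size divisible by grid size"
    s = n // p
    A = split_blocks(a, p)
    B = split_blocks(b, p)

    # Initial skew: block row i of A rotates left by i,
    # block column j of B rotates up by j.
    A = [[A[i][(j + i) % p] for j in range(p)] for i in range(p)]
    B = [[B[(i + j) % p][j] for j in range(p)] for i in range(p)]

    # Each simulated worker (i, j) owns one accumulator block of C.
    C = [[[[0] * s for _ in range(s)] for _ in range(p)] for _ in range(p)]

    for _ in range(p):
        # Local step: every worker multiplies the blocks it currently holds.
        for i in range(p):
            for j in range(p):
                partial = matmul(A[i][j], B[i][j])
                for r in range(s):
                    for c in range(s):
                        C[i][j][r][c] += partial[r][c]
        # Communication step: A blocks move one position left, B blocks one
        # position up. In a distributed setting this is where workers would
        # exchange blocks with their neighbours.
        A = [[A[i][(j + 1) % p] for j in range(p)] for i in range(p)]
        B = [[B[(i + 1) % p][j] for j in range(p)] for i in range(p)]

    # Reassemble the p x p grid of result blocks into one n x n matrix.
    return [sum((C[i][j][r] for j in range(p)), [])
            for i in range(p) for r in range(s)]
```

Each simulated worker only ever holds one block of each operand and one block of the result, which is the property that keeps the per-worker memory footprint small; the shift step requires all workers to progress in lockstep, which is the kind of coordination the abstract attributes to barrier execution mode.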

Code & Models

Repositories

starschema/jampi-spark-dotmatrix (official)