Experimental Evaluation of Multi-Round Matrix Multiplication on MapReduce
Matteo Ceccarello, Francesco Silvestri

TL;DR
This paper investigates the performance of multi-round MapReduce algorithms for matrix multiplication, demonstrating that multi-round approaches can be efficient and advantageous in cloud environments through an extensive experimental evaluation.
Contribution
It introduces the M3 library for scalable matrix multiplication and provides an empirical study comparing monolithic and multi-round approaches in cloud settings.
Findings
Multi-round algorithms can have small overheads compared to monolithic ones.
The M3 library effectively supports dense and sparse matrix multiplication.
Multi-round approaches exploit cloud features for performance benefits.
Abstract
A common approach in the design of MapReduce algorithms is to minimize the number of rounds. Indeed, there are many examples in the literature of monolithic MapReduce algorithms, which are algorithms requiring just one or two rounds. However, we claim that the design of monolithic algorithms may not be the best approach in cloud systems. Indeed, multi-round algorithms may exploit some features of cloud platforms by suitably setting the number of rounds according to the execution context. In this paper we carry out an experimental study of multi-round MapReduce algorithms, aiming to investigate the performance of the multi-round approach. We use matrix multiplication as a case study. We first propose a scalable Hadoop library, named M3, for matrix multiplication in the dense and sparse cases, which allows trading off the number of rounds against the amount of data shuffled in each round and the…
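The trade-off between the number of rounds and the data shuffled per round can be illustrated with a small simulation. The sketch below is not the library's implementation and does not use Hadoop: it is a plain-Python toy that assumes square matrices split into a q x q grid of blocks, assumes the round count divides q, and measures shuffle size simply as the number of blocks sent per round. All function names (`split_blocks`, `multi_round_multiply`, and so on) are hypothetical.

```python
from itertools import product


def split_blocks(matrix, q):
    """Split an n x n list-of-lists matrix into a q x q grid of square blocks."""
    n = len(matrix)
    if n % q != 0:
        raise ValueError("q must divide the matrix side")
    side = n // q
    return {
        (i, j): [row[j * side:(j + 1) * side]
                 for row in matrix[i * side:(i + 1) * side]]
        for i, j in product(range(q), repeat=2)
    }


def block_multiply(x, y):
    """Plain cubic-time product of two square blocks."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def block_add(x, y):
    """Element-wise sum of two blocks, returned as a new block."""
    return [[u + v for u, v in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]


def multi_round_multiply(a, b, q, rounds):
    """Simulate a multi-round block product (illustrative assumption only).

    Returns the product matrix and the number of input blocks
    shuffled in each round.
    """
    if q % rounds != 0:
        raise ValueError("rounds must divide q in this sketch")
    n = len(a)
    side = n // q
    a_blocks, b_blocks = split_blocks(a, q), split_blocks(b, q)
    zero = [[0] * side for _ in range(side)]
    partial = {key: zero for key in product(range(q), repeat=2)}
    per_round = q // rounds
    shuffled = []
    for r in range(rounds):
        ks = range(r * per_round, (r + 1) * per_round)
        # "Map": pair A[i][k] with B[k][j] for the slice of k handled this round.
        pairs = [((i, j), a_blocks[i, k], b_blocks[k, j])
                 for i, j in product(range(q), repeat=2) for k in ks]
        shuffled.append(2 * len(pairs))  # two input blocks per pair
        # "Reduce": multiply each pair and accumulate into the partial sums.
        for key, x, y in pairs:
            partial[key] = block_add(partial[key], block_multiply(x, y))
    # Reassemble the q x q grid of blocks into one n x n matrix.
    c = [[0] * n for _ in range(n)]
    for (i, j), block in partial.items():
        for offset, row in enumerate(block):
            c[i * side + offset][j * side:(j + 1) * side] = row
    return c, shuffled
```

In this toy model the total number of blocks moved is the same for any round count, but each round moves only a fraction of it: with q = 2, one round shuffles 16 blocks, while two rounds shuffle 8 blocks each. A real deployment would also have to account for the partial sums carried between rounds and for per-round startup costs, which this sketch ignores.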
Taxonomy
Topics: Cloud Computing and Resource Management · Stochastic Gradient Optimization Techniques · Parallel Computing and Optimization Techniques
