# Distributed Matrix Factorization using Asynchronous Communication

**Authors:** Tom Vander Aa, Imen Chakroun, Tom Haber

arXiv: 1705.10633 · 2017-05-31

## TL;DR

This paper presents a distributed parallel implementation of Bayesian Probabilistic Matrix Factorization using Gibbs sampling, achieving improved performance on large-scale data through asynchronous communication and load balancing techniques.

## Contribution

The paper introduces a distributed implementation of BPMF that combines work stealing for load balancing on a single node with asynchronous communication across nodes, and reports that it outperforms state-of-the-art implementations.

## Key findings

- Achieved higher performance in the distributed version through asynchronous communication.
- Improved load balancing with work stealing on a single node.
- Outperformed state-of-the-art implementations in experiments.
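
The work-stealing idea in the findings above can be illustrated with a toy discrete-time simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function name `simulate_work_stealing`, the integer task costs, and the "steal from the longest queue" policy are all hypothetical choices made for illustration.

```python
from collections import deque


def simulate_work_stealing(task_costs_per_worker):
    """Simulate work stealing in discrete time steps.

    task_costs_per_worker: one list of positive integer task costs per worker
    (hypothetical input format). Returns (makespan, number_of_steals).
    """
    queues = [deque(costs) for costs in task_costs_per_worker]
    remaining = [0] * len(queues)  # time left on each worker's current task
    clock = 0
    steals = 0
    while True:
        for worker, queue in enumerate(queues):
            if remaining[worker] > 0:
                continue  # still busy with its current task
            if queue:
                # Owner takes work from the tail of its own queue.
                remaining[worker] = queue.pop()
            else:
                # Idle worker steals from the head of the longest queue.
                victim = max(range(len(queues)), key=lambda v: len(queues[v]))
                if queues[victim]:
                    remaining[worker] = queues[victim].popleft()
                    steals += 1
        if not any(remaining) and not any(queues):
            return clock, steals
        # Advance time by one unit.
        clock += 1
        remaining = [max(0, r - 1) for r in remaining]


# Eight unit-cost tasks start on worker 0; worker 1 starts with nothing.
print(simulate_work_stealing([[1] * 8, []]))
```

Without stealing, worker 0 would need eight time units in this example; with stealing, the idle worker takes half of the tasks. In BPMF the per-item cost depends on how many ratings an item has, so a static split of items can leave some workers idle, which is the situation work stealing addresses.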

## Abstract

Matrix factorization is a common technique in machine learning, mainly in areas like recommender systems. Despite its high prediction accuracy and its ability to avoid over-fitting of the data, the Bayesian Probabilistic Matrix Factorization algorithm (BPMF) has not been widely used on large-scale data because of its prohibitive cost. In this paper, we propose a distributed high-performance parallel implementation of BPMF using Gibbs sampling on shared and distributed architectures. We show that, by using efficient load balancing with work stealing on a single node and asynchronous communication in the distributed version, we beat state-of-the-art implementations.
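
To make the Gibbs sampling step mentioned in the abstract more concrete, here is a minimal NumPy sketch of the conditional update for a single user's latent vector. It assumes the standard BPMF formulation, in which that conditional is Gaussian with precision $\Lambda_U + \alpha \sum_j v_j v_j^\top$. The function name `sample_user_vector` and all parameter names are illustrative assumptions and are not taken from the paper's code.

```python
import numpy as np


def sample_user_vector(ratings, item_vectors, mu_u, lambda_u, alpha, rng):
    """Draw one user's latent vector from its Gaussian conditional posterior.

    ratings:      shape (n,)   ratings given by this user
    item_vectors: shape (n, d) latent vectors of the rated items
    mu_u:         shape (d,)   prior mean (hyperparameter)
    lambda_u:     shape (d, d) prior precision (hyperparameter)
    alpha:        scalar precision of the rating noise
    rng:          a numpy.random.Generator
    """
    # Posterior precision: prior precision plus alpha * sum of outer products.
    precision = lambda_u + alpha * item_vectors.T @ item_vectors
    # Right-hand side of the linear system that defines the posterior mean.
    rhs = alpha * item_vectors.T @ ratings + lambda_u @ mu_u
    mean = np.linalg.solve(precision, rhs)
    # Cholesky factor L with precision = L @ L.T.
    chol = np.linalg.cholesky(precision)
    # Solving L.T x = z with z ~ N(0, I) yields x ~ N(0, precision^{-1}).
    z = rng.standard_normal(mu_u.shape[0])
    return mean + np.linalg.solve(chol.T, z)


# Usage on synthetic data (all values are made up for illustration).
rng = np.random.default_rng(0)
d = 3
true_user = rng.standard_normal(d)
items = rng.standard_normal((200, d))
observed = items @ true_user
sample = sample_user_vector(observed, items, np.zeros(d), np.eye(d), 1e6, rng)
```

Given the item vectors, each user's update reads only the vectors of the items that user rated, so updates for different users can run in parallel. The cost of one update grows with the number of ratings, which is why load balancing matters for this kind of sampler.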

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.10633/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1705.10633/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/1705.10633/full.md

---
Source: https://tomesphere.com/paper/1705.10633