Distributed Methods for High-dimensional and Large-scale Tensor Factorization
Kijung Shin, U. Kang

TL;DR
This paper introduces two scalable distributed tensor factorization methods, SALS and CDTF, capable of decomposing large high-dimensional tensors efficiently on commodity hardware, outperforming existing methods in speed and memory usage.
Contribution
The paper presents novel distributed tensor factorization algorithms, SALS and CDTF, that are scalable, memory-efficient, and capable of handling extremely large tensors.
Findings
SALS and CDTF factorized a 5-dimensional tensor with 1 billion entries.
Both methods require significantly less memory than existing approaches.
MapReduce optimizations yielded up to a 98.2X speedup.
Abstract
Given a high-dimensional large-scale tensor, how can we decompose it into latent factors? Can we process it on commodity computers with limited memory? These questions are closely related to recommender systems, which have modeled rating data not as a matrix but as a tensor to utilize contextual information such as time and location. This increase in the dimension requires tensor factorization methods scalable with both the dimension and size of a tensor. In this paper, we propose two distributed tensor factorization methods, SALS and CDTF. Both methods are scalable with all aspects of data, and they show an interesting trade-off between convergence speed and memory requirements. SALS updates a subset of the columns of a factor matrix at a time, and CDTF, a special case of SALS, updates one column at a time. In our experiments, only our methods factorize a 5-dimensional tensor with 1…
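The column-at-a-time idea described in the abstract can be illustrated with a small single-machine sketch. This is not the authors' implementation: it is a plain coordinate-descent update for a rank-K CP model over the observed entries of a sparse tensor, written under the assumption that updating "one column at a time" means refitting column k of every factor matrix against the residual left by the other columns. The function names (`cd_factorize`, `column_product`, `rmse`), the L2 regularization weight `lam`, the initialization, and the iteration count are all hypothetical choices made for illustration. The distributed and MapReduce aspects, and the multi-column updates of SALS, are not shown.

```python
import math
import random


def column_product(factors, idx, k, skip=None):
    """Product of the column-k values over all modes, optionally skipping one mode."""
    p = 1.0
    for n, i in enumerate(idx):
        if n != skip:
            p *= factors[n][i][k]
    return p


def predict(factors, idx):
    """CP model prediction: sum over columns of the per-mode products."""
    rank = len(factors[0][0])
    return sum(column_product(factors, idx, k) for k in range(rank))


def rmse(entries, factors):
    """Root-mean-square error over the observed entries."""
    se = sum((v - predict(factors, idx)) ** 2 for idx, v in entries)
    return math.sqrt(se / len(entries))


def cd_factorize(entries, shape, rank, lam=0.01, outer_iters=20, seed=0):
    """Fit factor matrices one column at a time (illustrative sketch).

    entries: list of (index_tuple, value) for observed cells.
    shape:   tuple with the size of each mode.
    lam:     assumed L2 regularization weight; must be > 0 so that
             rows without observations do not cause division by zero.
    Returns one factor matrix per mode, each a list of rows of length `rank`.
    """
    rng = random.Random(seed)
    factors = [[[rng.uniform(0.1, 1.0) for _ in range(rank)]
                for _ in range(dim)] for dim in shape]
    residual = [v - predict(factors, idx) for idx, v in entries]

    for _ in range(outer_iters):
        for k in range(rank):
            # Add column k's contribution back, so the target is what
            # the remaining columns fail to explain.
            target = [r + column_product(factors, idx, k)
                      for r, (idx, _) in zip(residual, entries)]
            for n in range(len(shape)):
                num = [0.0] * shape[n]
                den = [lam] * shape[n]
                for t, (idx, _) in zip(target, entries):
                    others = column_product(factors, idx, k, skip=n)
                    num[idx[n]] += t * others
                    den[idx[n]] += others * others
                # Closed-form regularized least-squares update per row.
                for i in range(shape[n]):
                    factors[n][i][k] = num[i] / den[i]
            # Remove the refreshed column k from the target again.
            residual = [t - column_product(factors, idx, k)
                        for t, (idx, _) in zip(target, entries)]
    return factors


if __name__ == "__main__":
    # Tiny fully observed rank-1 tensor built from hypothetical vectors.
    a, b, c = [1.0, 2.0], [1.0, 3.0], [2.0, 1.0]
    data = [((i, j, l), a[i] * b[j] * c[l])
            for i in range(2) for j in range(2) for l in range(2)]
    fitted = cd_factorize(data, (2, 2, 2), rank=1)
    print("RMSE:", rmse(data, fitted))
```

Because each step touches a single column, the working state beyond the data is one residual value per observed entry plus one column per mode. That is in line with the abstract's point that the number of columns updated at a time trades memory against convergence speed.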
