SWIFT: Scalable Wasserstein Factorization for Sparse Nonnegative Tensors
Ardavan Afshar, Kejing Yin, Sherry Yan, Cheng Qian, Joyce C. Ho, Haesun Park, Jimeng Sun

TL;DR
SWIFT introduces a distribution-agnostic tensor factorization method based on Wasserstein distance, effectively capturing complex correlations and improving predictive performance on sparse nonnegative tensors.
Contribution
It formulates tensor factorization as an optimal transport problem using Wasserstein distance, enabling scalable, distribution-free modeling that leverages correlation structures.
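To illustrate why a Wasserstein-type loss can exploit correlation between entries where an elementwise loss cannot, here is a minimal sketch of an entropy-regularized optimal transport cost computed with Sinkhorn iterations. It is a generic textbook approximation, not SWIFT's algorithm. The function name `sinkhorn_cost`, the absolute-difference cost matrix, and the values of `reg` and `n_iters` are assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn_cost(a, b, cost, reg=0.1, n_iters=200):
    """Entropy-regularized OT cost between histograms a and b (illustrative only)."""
    K = np.exp(-cost / reg)              # Gibbs kernel derived from the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                # rescale so column marginals match b
        u = a / (K @ v)                  # rescale so row marginals match a
    plan = u[:, None] * K * v[None, :]   # transport plan
    return float(np.sum(plan * cost)), plan

# Hypothetical setup: three bins on a line; moving mass costs the distance moved.
bins = np.arange(3, dtype=float)
cost = np.abs(bins[:, None] - bins[None, :])

target = np.array([1.0, 0.0, 0.0])
near = np.array([0.0, 1.0, 0.0])         # mass placed one bin away
far = np.array([0.0, 0.0, 1.0])          # mass placed two bins away

d_near, plan_near = sinkhorn_cost(target, near, cost)
d_far, _ = sinkhorn_cost(target, far, cost)

# Squared error cannot tell the two candidates apart.
sq_near = float(np.sum((target - near) ** 2))
sq_far = float(np.sum((target - far) ** 2))
```

In this toy setup the squared error is identical for both candidates, while the transport cost is smaller for the candidate whose mass sits closer to the target. The cost matrix is where prior knowledge about which entries are related would enter.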
Findings
Achieves up to 11.31% relative improvement in downstream prediction accuracy over the best baseline.
Maintains scalability comparable to existing CP algorithms.
Outperforms the best baseline under noisy conditions by up to 17% (relative improvement).
Abstract
Existing tensor factorization methods assume that the input tensor follows some specific distribution (e.g., Poisson, Bernoulli, or Gaussian) and solve the factorization by minimizing an empirical loss function defined for that distribution. However, this approach suffers from several drawbacks: 1) In reality, the underlying distributions are complicated and unknown, so they cannot be approximated well by a simple distribution. 2) The correlation across dimensions of the input tensor is not well utilized, leading to sub-optimal performance. Although heuristics have been proposed to incorporate such correlation as side information under a Gaussian distribution, they cannot easily be generalized to other distributions. Thus, a more principled way of utilizing the correlation in tensor factorization models is still an open challenge. Without assuming any explicit distribution, we…
Taxonomy
Topics: Tensor decomposition and applications
