Dynasor: A Dynamic Memory Layout for Accelerating Sparse MTTKRP for Tensor Decomposition on Multi-core CPU
Sasindu Wijeratne, Rajgopal Kannan, Viktor Prasanna

TL;DR
This paper presents Dynasor, a novel dynamic memory layout and algorithm for sparse tensor decomposition that significantly accelerates spMTTKRP computations on multi-core CPUs by optimizing data locality, load balancing, and communication overhead.
Contribution
It introduces a new algorithm and tensor format that improve execution speed and efficiency of sparse tensor decomposition on multi-core CPUs.
Findings
Achieves 2.12x to 9.01x speedup over state-of-the-art implementations.
Effectively exploits data locality and load balancing in sparse tensor computations.
Reduces communication overhead through dynamic tensor remapping.
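The load-balancing idea in the findings can be illustrated with a small sketch. This is not Dynasor's partitioning scheme, whose details this summary does not give. It is a generic greedy heuristic, and the function name `balance_rows` and its inputs are hypothetical. It assigns each output-mode index (row) to the least-loaded thread, largest rows first, so that per-thread nonzero counts end up close. Because each row is owned by exactly one thread, threads in such a scheme would write to disjoint output rows.

```python
import heapq
from collections import Counter

def balance_rows(row_indices, num_threads):
    """Greedy assignment of output rows to threads by nonzero count.

    row_indices: output-mode index of every nonzero (hypothetical input).
    Returns (row -> thread mapping, per-thread nonzero load).
    """
    counts = Counter(row_indices)              # nonzeros per output row
    heap = [(0, t) for t in range(num_threads)]  # (current load, thread id)
    heapq.heapify(heap)
    assignment = {}
    loads = [0] * num_threads
    # Place the heaviest rows first; ties are broken by row index.
    for row, nnz in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        load, t = heapq.heappop(heap)          # least-loaded thread so far
        assignment[row] = t
        loads[t] = load + nnz
        heapq.heappush(heap, (loads[t], t))
    return assignment, loads

# Ten nonzeros spread over four output rows, split across two threads.
row_indices = [0, 0, 0, 0, 1, 1, 2, 2, 3, 3]
assignment, loads = balance_rows(row_indices, num_threads=2)
```

A real implementation would also have to account for cache behavior and for how the partitioning changes from one mode to the next, which this sketch ignores.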
Abstract
Sparse Matricized Tensor Times Khatri-Rao Product (spMTTKRP) is the most time-consuming compute kernel in sparse tensor decomposition. In this paper, we introduce a novel algorithm to minimize the execution time of spMTTKRP across all modes of an input tensor on multi-core CPU platforms. The proposed algorithm leverages the FLYCOO tensor format to exploit data locality in external memory accesses. It effectively utilizes computational resources by enabling lock-free concurrent processing of independent partitions of the input tensor. The proposed partitioning ensures load balancing among CPU threads. Our dynamic tensor remapping technique reduces communication overhead along all the modes. On widely used real-world tensors, our work achieves a 2.12x to 9.01x speedup in total execution time across all modes compared with state-of-the-art CPU implementations.
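For readers unfamiliar with the kernel, here is a minimal reference sketch of spMTTKRP for mode 0 of a 3-mode tensor stored in plain coordinate (COO) form. It is a baseline illustration only, not the paper's FLYCOO-based algorithm. The function name `sp_mttkrp_mode0`, the array layout, and the example data are assumptions made for this sketch. For each nonzero X[i, j, k], the kernel adds X[i, j, k] * B[j, :] * C[k, :] to output row i.

```python
import numpy as np

def sp_mttkrp_mode0(indices, values, B, C, num_rows):
    """Mode-0 spMTTKRP on a COO tensor (illustrative baseline).

    indices: (nnz, 3) integer coordinates; values: (nnz,) nonzero values.
    B: (J, R) and C: (K, R) are the factor matrices of modes 1 and 2.
    Returns an array of shape (num_rows, R).
    """
    i, j, k = indices[:, 0], indices[:, 1], indices[:, 2]
    # Per-nonzero contribution: value times the elementwise product of
    # the matching rows of B and C.
    contrib = values[:, None] * B[j] * C[k]
    out = np.zeros((num_rows, B.shape[1]))
    # Unbuffered scatter-add, so repeated row indices accumulate correctly.
    np.add.at(out, i, contrib)
    return out

# Small example: four nonzeros in a 3 x 3 x 3 tensor, rank 2.
indices = np.array([[0, 0, 0], [0, 1, 2], [1, 2, 1], [2, 0, 2]])
values = np.array([1.0, 2.0, 3.0, 4.0])
rng = np.random.default_rng(0)
B = rng.standard_normal((3, 2))
C = rng.standard_normal((3, 2))
M = sp_mttkrp_mode0(indices, values, B, C, num_rows=3)
```

The scatter-add on the output rows is where concurrent threads would conflict in a naive parallel version, which is why partitioning the nonzeros by output index matters for lock-free execution.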
Taxonomy
Topics: Tensor decomposition and applications · Parallel Computing and Optimization Techniques · Computational Physics and Python Applications
