# A Unified Optimization Approach for Sparse Tensor Operations on GPUs

**Authors:** Bangtian Liu, Chengyao Wen, Anand D. Sarwate, Maryam Mehri Dehnavi

arXiv: 1705.09905 · 2017-12-18

## TL;DR

This paper introduces F-COO, a unified tensor representation, and GPU-specific optimizations that significantly improve the performance of sparse tensor operations like SpTTM and SpMTTKRP on GPUs, enabling faster tensor decompositions.

## Contribution

The paper presents F-COO, a unified representation for sparse tensors, together with GPU-specific optimizations. On NVIDIA Titan-X GPUs, the resulting kernels outperform state-of-the-art work by up to 3.7x for SpTTM and up to 30.6x for SpMTTKRP.

## Key findings

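- Up to 3.7x speedup for SpTTM over state-of-the-art work on NVIDIA Titan-X GPUs.
- Up to 30.6x speedup for SpMTTKRP over state-of-the-art work on NVIDIA Titan-X GPUs.
- Up to 14.9x speedup for CANDECOMP/PARAFAC (CP) decomposition over state-of-the-art libraries on NVIDIA Titan-X GPUs.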

## Abstract

Sparse tensors appear in many large-scale applications with multidimensional and sparse data. While multidimensional sparse data often need to be processed on manycore processors, attempts to develop highly-optimized GPU-based implementations of sparse tensor operations are rare. The irregular computation patterns and sparsity structures as well as the large memory footprints of sparse tensor operations make such implementations challenging. We leverage the fact that sparse tensor operations share similar computation patterns to propose a unified tensor representation called F-COO. Combined with GPU-specific optimizations, F-COO provides highly-optimized implementations of sparse tensor computations on GPUs. The performance of the proposed unified approach is demonstrated for tensor-based kernels such as the Sparse Matricized Tensor-Times-Khatri-Rao Product (SpMTTKRP) and the Sparse Tensor-Times-Matrix Multiply (SpTTM) and is used in tensor decomposition algorithms. Compared to state-of-the-art work, we improve the performance of SpTTM and SpMTTKRP up to 3.7 and 30.6 times respectively on NVIDIA Titan-X GPUs. We implement a CANDECOMP/PARAFAC (CP) decomposition and achieve up to 14.9 times speedup using the unified method over state-of-the-art libraries on NVIDIA Titan-X GPUs.
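
The abstract notes that SpTTM and SpMTTKRP share similar computation patterns. A minimal CPU sketch in pure Python can make that concrete for a 3-mode tensor. It assumes a plain coordinate (COO) list of `(i, j, k, value)` nonzeros, which is not the paper's F-COO format, and it does not reflect the paper's GPU optimizations. The function names `spttm_mode3` and `spmttkrp_mode1` and the example data are hypothetical, chosen only for illustration.

```python
# Plain COO sketch (NOT the paper's F-COO layout or its GPU kernels).
# A 3-mode sparse tensor X is a list of (i, j, k, value) nonzeros.


def spttm_mode3(nonzeros, U, rank):
    """SpTTM on mode 3: Y[i, j, :] = sum_k X[i, j, k] * U[k, :]."""
    Y = {}  # semi-sparse output: one dense length-`rank` fiber per (i, j)
    for i, j, k, val in nonzeros:
        fiber = Y.setdefault((i, j), [0.0] * rank)
        for r in range(rank):
            fiber[r] += val * U[k][r]
    return Y


def spmttkrp_mode1(nonzeros, B, C, num_rows, rank):
    """SpMTTKRP on mode 1: M[i, :] = sum_{j,k} X[i, j, k] * (B[j, :] * C[k, :])."""
    M = [[0.0] * rank for _ in range(num_rows)]
    for i, j, k, val in nonzeros:
        for r in range(rank):
            # Elementwise product of factor rows, scaled by the nonzero value.
            M[i][r] += val * B[j][r] * C[k][r]
    return M


# Hypothetical 2 x 2 x 2 tensor with three nonzeros, and rank-2 factors.
X = [(0, 0, 0, 1.0), (0, 1, 1, 2.0), (1, 0, 1, 3.0)]
B = [[1.0, 2.0], [3.0, 4.0]]
C = [[5.0, 6.0], [7.0, 8.0]]

Y = spttm_mode3(X, C, rank=2)
M = spmttkrp_mode1(X, B, C, num_rows=2, rank=2)
```

Both loops visit each nonzero once and accumulate a scaled dense row into an output indexed by the non-contracted modes. SpTTM reduces over one mode and SpMTTKRP over two, which is the kind of shared structure a unified representation can exploit.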

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.09905/full.md

## Figures

18 figures with captions in the complete paper: https://tomesphere.com/paper/1705.09905/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1705.09905/full.md

---
Source: https://tomesphere.com/paper/1705.09905