# Randomized algorithms for low-rank tensor decompositions in the Tucker format

**Authors:** Rachel Minster, Arvind K. Saibaba, Misha E. Kilmer

arXiv: 1905.07311 · 2019-05-20

## TL;DR

This paper develops and analyzes randomized algorithms for low-rank tensor decompositions in the Tucker format, enabling efficient compression of large-scale datasets. The methods are tested on synthetic tensors, facial images from the Olivetti database, and word counts from the Enron email dataset.

## Contribution

The paper introduces randomized variants of the HOSVD and STHOSVD compression algorithms, together with a probabilistic analysis of their error. It also develops two variants for large-scale tensor data: an adaptive one and a structure-preserving one.
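To make the idea concrete, here is a minimal NumPy sketch of a randomized, sequentially truncated HOSVD. It is an illustration under stated assumptions, not the authors' implementation: the function names (`unfold`, `mode_product`, `randomized_range`, `randomized_sthosvd`, `reconstruct`), the Gaussian sketch, the default oversampling of 5, and the mode processing order 0, 1, 2, ... are all choices made for this example.

```python
import numpy as np

def unfold(X, mode):
    """Mode-`mode` unfolding: rows are indexed by the chosen mode."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def mode_product(X, M, mode):
    """Multiply tensor X by matrix M along `mode`; M has shape (new, X.shape[mode])."""
    Y = np.tensordot(M, X, axes=(1, mode))  # contracted axis is replaced by axis 0
    return np.moveaxis(Y, 0, mode)

def randomized_range(A, rank, oversample, rng):
    """Approximate leading left singular vectors of A from a Gaussian sketch."""
    ell = min(rank + oversample, A.shape[0], A.shape[1])
    Omega = rng.standard_normal((A.shape[1], ell))   # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)                   # basis for the sketched range
    U_small, _, _ = np.linalg.svd(Q.T @ A, full_matrices=False)
    return Q @ U_small[:, :rank]                     # truncate back to `rank` columns

def randomized_sthosvd(X, ranks, oversample=5, seed=0):
    """Process modes one at a time, shrinking the core after each mode."""
    rng = np.random.default_rng(seed)
    core, factors = X, []
    for mode, r in enumerate(ranks):
        U = randomized_range(unfold(core, mode), r, oversample, rng)
        factors.append(U)
        core = mode_product(core, U.T, mode)         # project the current core
    return core, factors

def reconstruct(core, factors):
    """Expand a Tucker representation back into a full tensor."""
    X = core
    for mode, U in enumerate(factors):
        X = mode_product(X, U, mode)
    return X
```

Calling `randomized_sthosvd(X, (r1, r2, r3))` on a three-way array returns a core of shape `(r1, r2, r3)` and one orthonormal factor matrix per mode. A randomized HOSVD would differ in that each factor is computed from an unfolding of the original tensor rather than of the partially compressed core.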

## Key findings

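- The randomized HOSVD and STHOSVD variants compute accurate low-rank Tucker approximations, and their error is analyzed probabilistically.
- An adaptive variant finds a low-rank representation that satisfies a given tolerance, so the target rank need not be known in advance.
- A structure-preserving variant retains the structure of the original tensor, which helps with large sparse tensors that are difficult to load in memory.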
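The adaptive idea can be illustrated on a single mode unfolding. The sketch below grows an orthonormal basis in blocks until the relative Frobenius-norm residual falls below a tolerance. It is a simple blocked scheme chosen for clarity and is not claimed to be the paper's adaptive algorithm; the name `adaptive_range` and the parameters `rel_tol` and `block` are hypothetical. It also forms the residual explicitly, which a large-scale implementation would avoid.

```python
import numpy as np

def adaptive_range(A, rel_tol, block=4, seed=0):
    """Grow an orthonormal basis Q until ||A - Q Q^T A||_F <= rel_tol * ||A||_F.

    Illustrative only: the residual is stored explicitly, which is
    affordable for small matrices but not for very large unfoldings.
    """
    rng = np.random.default_rng(seed)
    n, m = A.shape
    max_cols = min(n, m)
    norm_A = np.linalg.norm(A)
    Q = np.zeros((n, 0))
    R = A.copy()                                   # residual (I - Q Q^T) A
    while np.linalg.norm(R) > rel_tol * norm_A and Q.shape[1] < max_cols:
        b = min(block, max_cols - Q.shape[1])      # never exceed the full rank
        Y = R @ rng.standard_normal((m, b))        # sample the current residual
        Y -= Q @ (Q.T @ Y)                         # re-orthogonalize against Q
        Qb, _ = np.linalg.qr(Y)
        Q = np.hstack([Q, Qb])
        R -= Qb @ (Qb.T @ R)                       # deflate the residual
    return Q
```

The number of columns of the returned `Q` plays the role of the rank discovered for that mode. In a tensor setting one would apply such a routine to each mode unfolding in turn, with a per-mode tolerance derived from the overall target.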

## Abstract

Many applications in data science and scientific computing involve large-scale datasets that are expensive to store and compute with, but can be efficiently compressed and stored in an appropriate tensor format. In recent years, randomized matrix methods have been used to efficiently and accurately compute low-rank matrix decompositions. Motivated by this success, we focus on developing randomized algorithms for tensor decompositions in the Tucker representation. Specifically, we present randomized versions of two well-known compression algorithms, namely, HOSVD and STHOSVD. We present a detailed probabilistic analysis of the error of the randomized tensor algorithms. We also develop variants of these algorithms that tackle specific challenges posed by large-scale datasets. The first variant adaptively finds a low-rank representation satisfying a given tolerance and it is beneficial when the target-rank is not known in advance. The second variant preserves the structure of the original tensor, and is beneficial for large sparse tensors that are difficult to load in memory. We consider several different datasets for our numerical experiments: synthetic test tensors and realistic applications such as the compression of facial image samples in the Olivetti database and word counts in the Enron email dataset.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.07311/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/1905.07311/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/1905.07311/full.md

---
Source: https://tomesphere.com/paper/1905.07311