Parallel Tensor Compression for Large-Scale Scientific Data

Woody Austin; Grey Ballard; Tamara G. Kolda

arXiv:1510.06689·cs.NA·January 5, 2017

TL;DR

This paper introduces a distributed-memory parallel Tucker decomposition method for compressing massive scientific data represented as dense tensors, achieving compression ratios of up to 5000 with negligible loss in accuracy on real-world data sets.

Contribution

It presents the first distributed-memory parallel implementation of the Tucker decomposition for large-scale scientific data, using a tensor data distribution that avoids data redistribution.

Findings

01

Achieves compression ratios up to 5000 with minimal accuracy loss.

02

Demonstrates scalable parallel performance on real-world datasets.

03

Provides analysis of computation and communication costs.
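
To make the compression-ratio finding concrete: a Tucker decomposition stores a small core tensor plus one factor matrix per mode, so the ratio is the number of entries in the full tensor divided by the number of entries in the core and factors. The sketch below applies that count to the five-way grid described in the abstract. The function name and the ranks are hypothetical, chosen only for illustration; they are not values reported by the paper.

```python
from math import prod

def tucker_compression_ratio(dims, ranks):
    """Entries in the full tensor divided by entries stored by a Tucker model.

    A Tucker model keeps a core of shape `ranks` and, for each mode,
    a factor matrix of shape (dims[n], ranks[n]).
    """
    full = prod(dims)
    core = prod(ranks)
    factors = sum(n * r for n, r in zip(dims, ranks))
    return full / (core + factors)

# Grid from the abstract: 512^3 spatial points, 64 variables, 128 time steps.
dims = (512, 512, 512, 64, 128)

# Double precision uses 8 bytes per entry, giving 2**43 bytes (8 TB in binary units).
full_bytes = prod(dims) * 8

# Hypothetical ranks, assumed here purely for illustration.
ranks = (64, 64, 64, 16, 32)
ratio = tucker_compression_ratio(dims, ranks)
print(f"full size: {full_bytes} bytes, compression ratio: {ratio:.0f}")
```

The ratio is dominated by the core size whenever the ranks are not tiny, so reducing the rank in any one mode shrinks storage roughly in proportion.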

Abstract

As parallel computing trends towards the exascale, scientific data produced by high-fidelity simulations are growing increasingly massive. For instance, a simulation on a three-dimensional spatial grid with 512 points per dimension that tracks 64 variables per grid point for 128 time steps yields 8 TB of data, assuming double precision. By viewing the data as a dense five-way tensor, we can compute a Tucker decomposition to find inherent low-dimensional multilinear structure, achieving compression ratios of up to 5000 on real-world data sets with negligible loss in accuracy. So that we can operate on such massive data, we present the first-ever distributed-memory parallel implementation for the Tucker decomposition, whose key computations correspond to parallel linear algebra operations, albeit with nonstandard data layouts. Our approach specifies a data distribution for tensors that…
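
As a rough illustration of what computing a Tucker decomposition involves, here is a minimal serial NumPy sketch. It assumes a sequentially truncated higher-order SVD in which each factor matrix comes from the leading eigenvectors of a Gram matrix; the truncated abstract does not spell out the algorithm, so this choice, along with the helper names `unfold`, `mode_multiply`, `st_hosvd`, and `reconstruct`, is an assumption for illustration and not the paper's distributed implementation.

```python
import numpy as np

def unfold(x, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the other modes."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def mode_multiply(x, mat, mode):
    """Tensor-times-matrix: multiply `x` by `mat` along axis `mode`."""
    moved = np.moveaxis(x, mode, 0)
    out = np.tensordot(mat, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def st_hosvd(x, ranks):
    """Sequentially truncated HOSVD: returns (core, list of factor matrices)."""
    core = x
    factors = []
    for mode, r in enumerate(ranks):
        unf = unfold(core, mode)
        gram = unf @ unf.T                      # small symmetric matrix
        _, eigvecs = np.linalg.eigh(gram)       # eigenvalues in ascending order
        u = eigvecs[:, ::-1][:, :r]             # keep the r leading eigenvectors
        factors.append(u)
        core = mode_multiply(core, u.T, mode)   # shrink this mode before the next
    return core, factors

def reconstruct(core, factors):
    """Expand a Tucker model back to a full tensor."""
    x = core
    for mode, u in enumerate(factors):
        x = mode_multiply(x, u, mode)
    return x

# Usage: a small synthetic tensor with exact multilinear rank (2, 3, 2).
rng = np.random.default_rng(0)
true_core = rng.standard_normal((2, 3, 2))
true_factors = [rng.standard_normal((n, r)) for n, r in [(6, 2), (7, 3), (5, 2)]]
x = reconstruct(true_core, true_factors)

core, factors = st_hosvd(x, (2, 3, 2))
rel_err = np.linalg.norm(reconstruct(core, factors) - x) / np.linalg.norm(x)
print(core.shape, rel_err)
```

Each step here is an ordinary dense linear algebra operation (a Gram matrix, a symmetric eigenproblem, and a tensor-times-matrix product), which is consistent with the abstract's remark that the key computations correspond to parallel linear algebra operations.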

Code & Models

Repositories

https://gitlab.com/tensors/tuckermpi