Attention Based Machine Learning Methods for Data Reduction with Guaranteed Error Bounds

Xiao Li; Jaemoon Lee; Anand Rangarajan; Sanjay Ranka

arXiv:2409.05357·cs.LG·September 10, 2024

TL;DR

This paper introduces an attention-based hierarchical data compression method for scientific datasets that leverages spatial, temporal, and inter-variable correlations, achieving significantly higher compression ratios with guaranteed error bounds.

Contribution

It presents a novel attention-based autoencoder framework with error-bound guarantees, outperforming existing methods like SZ3 in scientific data reduction.
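The summary above does not say how the error bound is enforced. As a hedged illustration only, and not necessarily the authors' method, one generic way to guarantee a pointwise bound is to quantize the residual between the original data and a lossy reconstruction, then add the quantized residual back. In the sketch below, `correct_with_residuals` is a hypothetical helper and the noisy `reconstruction` is a stand-in for the output of a learned model such as an autoencoder.

```python
import math
import random


def correct_with_residuals(original, reconstruction, error_bound):
    """Quantize residuals so the corrected values stay within error_bound.

    Generic uniform-quantization sketch (an assumption for illustration,
    not taken from the paper).
    """
    # A bin width just under 2 * error_bound leaves a small margin for
    # floating-point rounding; rounding to the nearest bin then keeps the
    # leftover error at or below half a bin.
    step = 2.0 * error_bound * 0.999
    codes = [round((x - r) / step) for x, r in zip(original, reconstruction)]
    corrected = [r + c * step for r, c in zip(reconstruction, codes)]
    return codes, corrected


random.seed(0)
original = [math.sin(0.1 * i) for i in range(200)]
# Stand-in for a lossy model output: the original plus bounded noise.
reconstruction = [x + random.uniform(-0.05, 0.05) for x in original]

error_bound = 1e-3
codes, corrected = correct_with_residuals(original, reconstruction, error_bound)
max_err = max(abs(x - y) for x, y in zip(original, corrected))
print("max pointwise error within bound:", max_err <= error_bound)
```

In a scheme like this, the integer `codes` would be stored alongside the compressed representation, so a more accurate reconstruction yields smaller codes that an entropy coder can compress further.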

Findings

01. Up to 8x higher compression ratio on multi-variable datasets.

02. Achieves 3x and 2x higher compression ratios on single-variable datasets.

03. Effectively captures spatiotemporal and inter-variable correlations.

Abstract

Scientific applications in fields such as high-energy physics, computational fluid dynamics (CFD), and climate science generate vast amounts of data at high velocities. This exponential growth in data production is outpacing advances in computing power, network capabilities, and storage capacities. To address this challenge, data compression or reduction techniques are crucial. These scientific datasets have underlying data structures that consist of structured and block-structured multidimensional meshes where each grid point corresponds to a tensor. It is important that data reduction techniques leverage the strong spatial and temporal correlations that are ubiquitous in these applications. Additionally, applications such as CFD process tensors comprising a hundred-plus species and their attributes at each grid point. Reduction techniques should be able to leverage interrelationships…

Taxonomy

Topics: Neural Networks and Applications