# Multivariate Pointwise Information-Driven Data Sampling and Visualization

**Authors:** Soumya Dutta, Ayan Biswas, James Ahrens

arXiv: 1907.11762 · 2019-07-30

## TL;DR

This paper introduces a novel multivariate data sampling method based on pointwise information theory, enabling effective data reduction while preserving complex relationships among multiple variables for scientific data analysis.

## Contribution

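The work presents a new sub-sampling algorithm that uses pointwise information-theoretic measures to quantify the statistical association of each data point across multiple variables, and it preserves those associations in the reduced data so that multivariate feature queries remain accurate.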

## Key findings

- Effective preservation of multivariate relationships in sampled data
- Improved accuracy in multivariate feature queries
- Validated on several scientific datasets

## Abstract

With the increasing computing capabilities of modern supercomputers, the size of the data generated from scientific simulations is growing rapidly. As a result, application scientists need effective data summarization techniques that can reduce large-scale multivariate spatiotemporal data sets while preserving the important data properties, so that the reduced data can answer domain-specific queries involving multiple variables with sufficient accuracy. While analyzing complex scientific events, domain experts often analyze and visualize two or more variables together to obtain a better understanding of the characteristics of the data features. Therefore, data summarization techniques are required to analyze multi-variable relationships in detail and then perform data reduction such that the important features involving multiple variables are preserved in the reduced data. To achieve this, in this work, we propose a data sub-sampling algorithm for performing statistical data summarization that leverages pointwise information-theoretic measures to quantify the statistical association of data points considering multiple variables and generates sub-sampled data that preserves the statistical association among multiple variables. Using such reduced sampled data, we show that multivariate feature query and analysis can be done effectively. The efficacy of the proposed multivariate association-driven sampling algorithm is presented by applying it to several scientific data sets.
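
To make the idea of association-driven sampling concrete, the following is a minimal sketch for the two-variable case. It is not the authors' algorithm: it assumes pointwise mutual information (PMI) estimated from a joint histogram as the pointwise measure, and it assumes a simple weighted draw in which points with higher PMI are more likely to be kept. The function names, the bin count, and the sampling fraction are all hypothetical choices made for illustration.

```python
import numpy as np


def pointwise_mutual_information(x, y, bins=32):
    """Estimate the PMI (in bits) of every point (x[i], y[i]).

    Assumption: probabilities are estimated from a 2D histogram of the
    two 1D arrays x and y; the paper may use a different estimator.
    """
    joint, x_edges, y_edges = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()   # joint probability per bin
    p_x = p_xy.sum(axis=1)       # marginal over y
    p_y = p_xy.sum(axis=0)       # marginal over x

    # Map each point to its histogram bin using the interior edges,
    # which yields indices in the range 0 .. bins - 1.
    xi = np.digitize(x, x_edges[1:-1])
    yi = np.digitize(y, y_edges[1:-1])

    # Every point falls in a bin it helped populate, so no term is zero.
    return np.log2(p_xy[xi, yi] / (p_x[xi] * p_y[yi]))


def pmi_weighted_sample(x, y, fraction=0.1, bins=32, seed=0):
    """Return indices of a sub-sample that favors high-PMI points.

    Assumption: acceptance weight grows linearly with PMI after shifting
    it to be positive; this is an illustrative rule, not the paper's.
    """
    pmi = pointwise_mutual_information(x, y, bins)
    weights = pmi - pmi.min() + 1e-9   # strictly positive weights
    weights /= weights.sum()
    rng = np.random.default_rng(seed)
    k = max(1, int(round(fraction * x.size)))
    return rng.choice(x.size, size=k, replace=False, p=weights)
```

Calling `pmi_weighted_sample(x, y, fraction=0.1)` on two flattened scalar fields returns the indices of the retained points. Averaging the per-point PMI values over all points recovers the histogram estimate of the mutual information between the two variables, which is why a pointwise measure is a natural way to rank individual samples by how much they contribute to the overall association.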

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.11762/full.md

## Figures

56 figures with captions in the complete paper: https://tomesphere.com/paper/1907.11762/full.md

## References

73 references — full list in the complete paper: https://tomesphere.com/paper/1907.11762/full.md

---
Source: https://tomesphere.com/paper/1907.11762