# Fast post-hoc method for updating moments of large datasets

**Authors:** Benjamin J. Q. Woods

arXiv: 1812.09372 · 2018-12-27

## TL;DR

This paper introduces a fast, memory-efficient post-hoc method for updating the statistical moments of a large dataset using only the previously computed moments and the newly appended data, which reduces both storage requirements and computation time.

## Contribution

The proposed method updates a dataset's moments without recalculating them from scratch; the efficiency gains are largest for large datasets or low-order moments.
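As an illustration of why the previous moments can suffice, here is a sketch based on the standard binomial re-centring identity. It is not taken from the paper's body, which is omitted in this view, so the notation is an assumption: $N$ old points $x_i$ with mean $\mu$ and central moments $m_k = \frac{1}{N}\sum_i (x_i - \mu)^k$ (so $m_0 = 1$, $m_1 = 0$), and $M$ appended points $y_j$.

$$
\mu' = \frac{N\mu + \sum_{j=1}^{M} y_j}{N + M}, \qquad \delta = \mu - \mu'
$$

$$
m_n' = \frac{1}{N + M}\left[\, N \sum_{k=0}^{n} \binom{n}{k}\, \delta^{\,n-k}\, m_k \;+\; \sum_{j=1}^{M} \left(y_j - \mu'\right)^n \right]
$$

The first term follows from writing $x_i - \mu' = (x_i - \mu) + \delta$ and expanding the $n$-th power. Under these assumptions the update touches roughly $n + M$ terms instead of $N + M$, which is consistent with the claimed advantage when the moment order is small compared with the dataset size.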

## Key findings

- Reduces data storage needs for moment updates
- Speeds up computation for large datasets
- Effective for low-order moments (order n ≲ 10)

## Abstract

Moments of large datasets utilise the mean of the dataset; consequently, updating the dataset traditionally requires one to update the mean, which then requires one to recalculate the moment. This means that metrics such as the standard deviation, $R^2$ correlation, and other statistics have to be "refreshed" for dataset updates, requiring large data storage and taking long times to process. Here, a method is shown for updating moments that only requires the previous moments (which are computationally cheaper to store), and the new data to be appended. This leads to a dramatic decrease in data storage requirements, and significant computational speed-up for large datasets or low-order moments ($n \lesssim 10$).
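The idea in the abstract can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the author's implementation: the function names `central_moments` and `update_moments` are hypothetical, and the update uses the standard binomial re-centring of central moments about the new mean. It relies only on the standard library (`math.comb`, available from Python 3.8).

```python
from math import comb


def central_moments(data, max_order):
    """Direct (from-scratch) computation of the count, mean and
    central moments m_0..m_max_order of a list of numbers."""
    count = len(data)
    mean = sum(data) / count
    moments = [
        sum((x - mean) ** k for x in data) / count
        for k in range(max_order + 1)
    ]
    return count, mean, moments


def update_moments(count_old, mean_old, moments_old, new_data):
    """Hypothetical helper: update central moments using only the stored
    summaries (count, mean, moments) and the appended data."""
    count_new = count_old + len(new_data)
    mean_new = (count_old * mean_old + sum(new_data)) / count_new
    # Shift of the old mean relative to the updated mean.
    delta = mean_old - mean_new

    moments_new = []
    for order in range(len(moments_old)):
        # Re-centre the old data's contribution via the binomial expansion
        # of ((x - mean_old) + delta) ** order; no old data points needed.
        old_part = count_old * sum(
            comb(order, k) * delta ** (order - k) * moments_old[k]
            for k in range(order + 1)
        )
        # The appended points are summed directly about the new mean.
        new_part = sum((y - mean_new) ** order for y in new_data)
        moments_new.append((old_part + new_part) / count_new)
    return count_new, mean_new, moments_new


# Usage: summarise the original data once, then append without revisiting it.
old_data = [1.0, 2.0, 4.0, 7.0]
count, mean, moments = central_moments(old_data, max_order=4)
count, mean, moments = update_moments(count, mean, moments, [3.0, 10.0])
variance = moments[2]  # population variance of all six points
```

Only `count`, `mean`, and the list `moments` need to be stored between updates; the original data can be discarded. Because each update adds floating-point rounding, a long chain of updates at high order may drift from a from-scratch calculation.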

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.09372/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1812.09372/full.md

## References

6 references — full list in the complete paper: https://tomesphere.com/paper/1812.09372/full.md

---
Source: https://tomesphere.com/paper/1812.09372