Thinning a Wishart Random Matrix

Ameer Dharamshi, Anna Neufeld, Lucy L. Gao, Daniela Witten, and Jacob Bien

arXiv:2502.09957·stat.ME·December 16, 2025

TL;DR

This paper introduces a method to generate independent data matrices from only the sample mean and Wishart-distributed sample covariance, enabling data thinning without direct access to raw data.

Contribution

It provides the first thinning strategy for Wishart-distributed covariance matrices, allowing independent data generation from summary statistics.

Findings

1. Independent data matrices can be generated from the sample mean and sample covariance alone.

2. The method preserves the original sample mean and covariance when the generated matrices are recombined.

3. The method enables privacy-preserving data analysis and validation without access to the raw data.

Abstract

Recent work has explored data thinning, a generalization of sample splitting that involves decomposing a (possibly matrix-valued) random variable into independent components. In the special case of an $n \times p$ random matrix with independent and identically distributed $N_p(\mu, \Sigma)$ rows, Dharamshi et al. (2024a) provide a comprehensive analysis of the settings in which thinning is or is not possible: briefly, if $\Sigma$ is unknown, then one can thin provided that $n > 1$. However, in some situations a data analyst may not have direct access to the data itself. For example, to preserve individuals' privacy, a data bank may provide only summary statistics such as the sample mean and sample covariance matrix. While the sample mean follows a Gaussian distribution, the sample covariance follows (up to scaling) a Wishart distribution, for which no thinning strategies have yet been…
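To make the abstract's setting concrete, the following is a minimal sketch of the summary statistics a data bank might release. It is not the paper's thinning procedure. It assumes NumPy is available, and the helper name `summary_statistics`, the dimensions, and the parameter values are all hypothetical choices for illustration. The sketch relies on the standard fact that, for i.i.d. $N_p(\mu, \Sigma)$ rows, the scaled sample covariance $(n-1)S$ follows a Wishart distribution with $n-1$ degrees of freedom and scale $\Sigma$, so its expectation is $(n-1)\Sigma$. The code checks that expectation by Monte Carlo.

```python
import numpy as np


def summary_statistics(X):
    """Return the sample mean and unbiased sample covariance of the rows of X.

    Hypothetical helper: stands in for what a data bank might release
    instead of the raw n x p data matrix.
    """
    n = X.shape[0]
    xbar = X.mean(axis=0)
    centered = X - xbar
    S = centered.T @ centered / (n - 1)
    return xbar, S


# Illustrative (assumed) problem size and parameters.
rng = np.random.default_rng(0)
n, p = 50, 3
mu = np.array([1.0, -2.0, 0.5])
A = rng.standard_normal((p, p))
Sigma = A @ A.T + p * np.eye(p)  # a positive definite covariance matrix

# Average the scatter matrix (n - 1) * S over many simulated datasets.
reps = 2000
scatter_mean = np.zeros((p, p))
for _ in range(reps):
    X = rng.multivariate_normal(mu, Sigma, size=n)
    xbar, S = summary_statistics(X)
    scatter_mean += (n - 1) * S / reps

# The Wishart mean is (n - 1) * Sigma; the relative error should be small.
target = (n - 1) * Sigma
rel_err = np.linalg.norm(scatter_mean - target) / np.linalg.norm(target)
print(f"relative error of Monte Carlo Wishart mean: {rel_err:.4f}")
```

An analyst in the setting described above would see only `xbar` and `S` for a single dataset, never `X` itself. The simulation loop exists purely to illustrate the distribution of those summaries.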

Code & Models

Repositories

AmeerD/Wishart (official)

Taxonomy

Topics: Bayesian Methods and Mixture Models · Statistical Mechanics and Entropy · Rough Sets and Fuzzy Logic