A concentration theorem for projections
Sanjoy Dasgupta, Daniel Hsu, Nakul Verma

TL;DR
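This paper proves a concentration theorem showing that almost all linear projections of high-dimensional, zero-mean data with finite second moments resemble scale-mixtures of spherical Gaussians. How close the resemblance is depends on the ratio of the projection dimension to the original dimension and on a coefficient of eccentricity of the data distribution.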
Contribution
It states a precise concentration theorem for projections that makes explicit the scale-mixture-of-Gaussians structure of projected high-dimensional data.
Findings
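Almost all projections into R^d resemble a mixture of spherical Gaussians N(0, sigma^2 I_d), where the sigma component has weight P(||X||^2 = sigma^2 D)
The strength of the effect depends on the ratio d/D of projection dimension to original dimension, and on a coefficient of eccentricity of X's distribution
The result is explored in a variety of experiments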
Abstract
Let X in R^D be a random vector with mean zero and finite second moments. We show that there is a precise sense in which almost all linear projections of X into R^d (for d < D) look like a scale-mixture of spherical Gaussians: specifically, a mixture of distributions N(0, sigma^2 I_d) where the weight of the particular sigma component is P(||X||^2 = sigma^2 D). The extent of this effect depends upon the ratio of d to D, and upon a particular coefficient of eccentricity of X's distribution. We explore this result in a variety of experiments.
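The claim in the abstract can be checked numerically on a toy example. The sketch below is not taken from the paper's experiments; the data distribution, the dimensions, and the helper names `random_projection` and `sample_x` are all assumptions chosen for illustration. It draws X = s * Z, where Z is uniform on the vertices of the hypercube {-1, +1}^D and s is 1 or 3 with equal probability, so that ||X||^2 = s^2 D and the predicted projection is the mixture 0.5 N(0, 1) + 0.5 N(0, 9). It then compares the variance and kurtosis of a random one-dimensional projection with the values implied by that mixture.

```python
import numpy as np


def random_projection(D, d, rng):
    """Return a d x D matrix with orthonormal rows spanning a random subspace.

    Illustrative helper: the Q factor of a D x d Gaussian matrix gives an
    orthonormal basis of a uniformly distributed d-dimensional subspace.
    """
    G = rng.standard_normal((D, d))
    Q, _ = np.linalg.qr(G)  # Q has shape (D, d) with orthonormal columns
    return Q.T


def sample_x(n, D, rng):
    """Draw n zero-mean samples X = s * Z (toy distribution, an assumption).

    Z is uniform on {-1, +1}^D and s is 1 or 3 with probability 1/2 each,
    so ||X||^2 / D equals s^2, which is either 1 or 9.
    """
    signs = rng.choice([-1.0, 1.0], size=(n, D))
    scales = rng.choice([1.0, 3.0], size=(n, 1))
    return scales * signs


rng = np.random.default_rng(0)
D, d, n = 200, 1, 20000

X = sample_x(n, D, rng)
P = random_projection(D, d, rng)
Y = (X @ P.T)[:, 0]  # one-dimensional projection of every sample

# Moments predicted by the scale mixture 0.5*N(0, 1) + 0.5*N(0, 9).
weights = np.array([0.5, 0.5])
sigma2 = np.array([1.0, 9.0])
pred_var = weights @ sigma2                          # 5.0
pred_kurt = 3.0 * (weights @ sigma2**2) / pred_var**2  # 4.92

# Empirical moments of the projected data (population mean is zero).
emp_var = np.mean(Y**2)
emp_kurt = np.mean(Y**4) / emp_var**2

print(f"variance: predicted {pred_var:.2f}, empirical {emp_var:.2f}")
print(f"kurtosis: predicted {pred_kurt:.2f}, empirical {emp_kurt:.2f}")
```

A single Gaussian has kurtosis 3, so an empirical kurtosis close to 4.92 rather than 3 is what separates the scale-mixture prediction from a plain Gaussian approximation. Matching two moments is only a coarse check, not a test of closeness in distribution.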
Taxonomy
Topics: Bayesian Methods and Mixture Models · Markov Chains and Monte Carlo Methods · Statistical Mechanics and Entropy
