A concentration theorem for projections

Sanjoy Dasgupta; Daniel Hsu; Nakul Verma

arXiv:1206.6813·cs.LG·July 2, 2012·2 cites

Open Access

TL;DR

This paper proves a concentration theorem showing that almost all linear projections of zero-mean data in R^D into R^d (d < D) resemble scale mixtures of spherical Gaussians, with the strength of the resemblance depending on the ratio of d to D and on a coefficient of eccentricity of the data distribution.

Contribution

It states a concentration theorem that makes precise the sense in which almost all low-dimensional linear projections of high-dimensional data look like a scale mixture of spherical Gaussians.

Findings

01

Almost all projections into R^d resemble a mixture of spherical Gaussians N(0, sigma^2 I_d), where each sigma component is weighted by P(||X||^2 = sigma^2 D)

02

The extent of the effect depends on the ratio of the projection dimension d to the original dimension D, and on a coefficient of eccentricity of X's distribution

03

The result is explored in a variety of experiments

Abstract

X in R^D has mean zero and finite second moments. We show that there is a precise sense in which almost all linear projections of X into R^d (for d < D) look like a scale-mixture of spherical Gaussians -- specifically, a mixture of distributions N(0, sigma^2 I_d) where the weight of the particular sigma component is P(||X||^2 = sigma^2 D). The extent of this effect depends upon the ratio of d to D, and upon a particular coefficient of eccentricity of X's distribution. We explore this result in a variety of experiments.
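
The statement in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' experimental code: it assumes NumPy, and the helper names `random_projection` and `scale_mixture_sample` are hypothetical, introduced here only for illustration. It draws a random orthonormal projection from R^D to R^d and, for comparison, samples from the scale mixture described above by setting sigma^2 = ||x||^2 / D for each data point x. The toy data are uniform on the hypercube {-1, +1}^D, an assumption chosen because every point then has ||x||^2 = D, so the mixture collapses to the single component N(0, I_d).

```python
import numpy as np


def random_projection(D, d, rng):
    """Return a d x D matrix with orthonormal rows spanning a random subspace."""
    G = rng.standard_normal((D, d))
    Q, R = np.linalg.qr(G)          # Q is D x d with orthonormal columns
    Q = Q * np.sign(np.diag(R))     # sign fix so the subspace basis is Haar-distributed
    return Q.T


def scale_mixture_sample(X, d, rng):
    """Draw one point per row of X from N(0, sigma^2 I_d), sigma^2 = ||x||^2 / D."""
    n, D = X.shape
    sigma = np.linalg.norm(X, axis=1) / np.sqrt(D)
    return sigma[:, None] * rng.standard_normal((n, d))


# Toy data (an assumption for illustration): zero-mean points on {-1, +1}^D.
rng = np.random.default_rng(0)
D, d, n = 200, 2, 2000
X = rng.choice([-1.0, 1.0], size=(n, D))

P = random_projection(D, d, rng)
Y = X @ P.T                          # projected data, shape (n, d)
Z = scale_mixture_sample(X, d, rng)  # reference sample from the scale mixture

print("projected variance per coordinate:", Y.var(axis=0))
print("mixture variance per coordinate:  ", Z.var(axis=0))
```

Comparing `Y` and `Z` (for instance through per-coordinate variances or histograms) gives a rough visual check only; the theorem itself concerns almost all projections and quantifies the discrepancy in terms of d/D and the eccentricity coefficient, which this sketch does not compute.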

Taxonomy

Topics: Bayesian Methods and Mixture Models · Markov Chains and Monte Carlo Methods · Statistical Mechanics and Entropy