The Breakdown of Gaussian Universality in Classification of High-dimensional Linear Factor Mixtures
Xiaoyi Mai, Zhenyu Liao

TL;DR
This paper demonstrates that Gaussian universality does not hold in high-dimensional classification of linear factor mixture data, showing that data distribution details influence learning performance beyond mean and covariance.
Contribution
It provides a precise high-dimensional characterization of empirical risk minimization for classification of linear factor mixture data, revealing where Gaussian universality breaks down.
Findings
Gaussian universality breaks down in linear factor mixture models
Data distribution influences asymptotic learning performance beyond mean and covariance
Conditions for Gaussian universality are specified and discussed
Abstract
The assumption of Gaussian or Gaussian mixture data has been extensively exploited in a long series of precise performance analyses of machine learning (ML) methods, on large datasets having comparably numerous samples and features. To relax this restrictive assumption, subsequent efforts have been devoted to establishing "Gaussian equivalent principles" by studying scenarios of Gaussian universality, where the asymptotic performance of ML methods on non-Gaussian data remains unchanged when the data are replaced with Gaussian data having the same mean and covariance. Beyond the realm of Gaussian universality, there are few exact results on how the data distribution affects the learning performance. In this article, we provide a precise high-dimensional characterization of empirical risk minimization, for classification under a general mixture data setting of linear factor models that extends Gaussian…
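The "same mean and covariance" comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's analysis: it assumes a toy linear factor mixture x = mean_y + A z with i.i.d. standardized factors z, and the Rademacher factor law, the loading matrix `LOADING`, the class means `MEANS`, and all function names are hypothetical choices made for illustration. It only shows that a non-Gaussian factor mixture and its Gaussian counterpart can share class means and covariances while differing in higher moments.

```python
import random


def sample_mixture(n, means, loading, factor="rademacher", seed=0):
    """Draw n labelled points x = means[y] + loading @ z.

    z has i.i.d. zero-mean, unit-variance entries, so each class has mean
    means[y] and covariance loading @ loading^T whatever the factor law is.
    """
    rng = random.Random(seed)
    p, q = len(loading), len(loading[0])
    xs, ys = [], []
    for _ in range(n):
        y = rng.randrange(len(means))  # equal class priors (assumption)
        if factor == "gaussian":
            z = [rng.gauss(0.0, 1.0) for _ in range(q)]
        else:
            z = [rng.choice((-1.0, 1.0)) for _ in range(q)]  # Rademacher
        xs.append([means[y][i] + sum(loading[i][k] * z[k] for k in range(q))
                   for i in range(p)])
        ys.append(y)
    return xs, ys


def class_moments(xs, ys, label):
    """Empirical mean vector and covariance matrix of one class."""
    pts = [x for x, y in zip(xs, ys) if y == label]
    m, p = len(pts), len(pts[0])
    mean = [sum(x[i] for x in pts) / m for i in range(p)]
    cov = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in pts) / m
            for j in range(p)] for i in range(p)]
    return mean, cov


def fourth_moment_ratio(xs, ys, label, coord):
    """E[(x - mean)^4] / Var(x)^2 for one coordinate of one class."""
    vals = [x[coord] for x, y in zip(xs, ys) if y == label]
    m = sum(vals) / len(vals)
    var = sum((v - m) ** 2 for v in vals) / len(vals)
    return sum((v - m) ** 4 for v in vals) / len(vals) / var ** 2


# Hypothetical toy parameters (p = 3 features, q = 2 factors).
MEANS = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
LOADING = [[1.0, 0.0], [0.5, 1.0], [0.0, 0.5]]

if __name__ == "__main__":
    for law in ("rademacher", "gaussian"):
        xs, ys = sample_mixture(20000, MEANS, LOADING, factor=law, seed=1)
        mean, cov = class_moments(xs, ys, 0)
        # Print class-0 mean, covariance diagonal, and fourth-moment ratio.
        print(law, [round(v, 2) for v in mean],
              [round(cov[i][i], 2) for i in range(3)],
              round(fourth_moment_ratio(xs, ys, 0, 0), 2))
```

In this toy setup the two datasets agree on first and second moments by construction, so any gap in a classifier's performance between them would have to come from distributional details beyond mean and covariance. Whether such a gap actually appears depends on the learning method and regime, which is the question the paper addresses.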
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Advanced Statistical Methods and Models
