Pitfalls of Gaussians as a noise distribution in NCE
Holden Lee, Chirag Pabbaraju, Anish Sevekari, Andrej Risteski

TL;DR
This paper shows that using a Gaussian noise distribution in Noise Contrastive Estimation (NCE) can make the Hessian of the loss exponentially ill-conditioned in the ambient dimension. This hurts both statistical and computational efficiency and points to the need for more complex noise distributions.
Contribution
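The paper shows that a Gaussian noise distribution matching the mean and covariance of the data can make the conditioning of the Hessian of the NCE loss exponentially bad in the ambient dimension, even for very simple data distributions. It argues that more complex noise distributions are needed.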
Findings
Gaussian noise can make the Hessian of the NCE loss exponentially ill-conditioned in the ambient dimension
Poor conditioning impacts statistical and computational efficiency
More complex noise distributions are necessary for effective NCE
Abstract
Noise Contrastive Estimation (NCE) is a popular approach for learning probability density functions parameterized up to a constant of proportionality. The main idea is to design a classification problem for distinguishing training data from samples from an easy-to-sample noise distribution q, in a manner that avoids having to calculate a partition function. It is well-known that the choice of q can severely impact the computational and statistical efficiency of NCE. In practice, a common choice for q is a Gaussian which matches the mean and covariance of the data. In this paper, we show that such a choice can result in an exponentially bad (in the ambient dimension) conditioning of the Hessian of the loss, even for very simple data distributions. As a consequence, both the statistical and algorithmic complexity for such a choice of q will be problematic in practice, suggesting…
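The classification problem described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the paper's setup: it assumes a hypothetical one-dimensional unnormalized Laplace model with parameters (mu, c), where c stands in for the negative log partition function, and a Gaussian noise distribution q whose mean and standard deviation are matched to the data. The function names and the toy parameter values are assumptions made for this example.

```python
import numpy as np

def log_model_unnormalized(x, theta):
    # Hypothetical model: log p_theta(x) = -|x - mu| + c, where c is a learned
    # constant standing in for the negative log partition function.
    mu, c = theta
    return -np.abs(x - mu) + c

def log_gaussian_noise(x, mean, std):
    # Log-density of the Gaussian noise distribution q.
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std) - 0.5 * np.log(2 * np.pi)

def nce_loss(theta, data, noise, noise_mean, noise_std):
    # Logit of "this sample came from the data": log p_theta(x) - log q(x).
    logit_data = log_model_unnormalized(data, theta) - log_gaussian_noise(data, noise_mean, noise_std)
    logit_noise = log_model_unnormalized(noise, theta) - log_gaussian_noise(noise, noise_mean, noise_std)
    # Binary cross-entropy with equal numbers of data and noise samples.
    # np.logaddexp(0, -z) equals -log(sigmoid(z)), computed stably.
    loss_data = np.mean(np.logaddexp(0.0, -logit_data))
    loss_noise = np.mean(np.logaddexp(0.0, logit_noise))
    return loss_data + loss_noise

rng = np.random.default_rng(0)
data = rng.laplace(loc=1.0, scale=1.0, size=5000)

# Noise: a Gaussian matching the empirical mean and standard deviation of the data.
noise_mean, noise_std = data.mean(), data.std()
noise = rng.normal(loc=noise_mean, scale=noise_std, size=5000)

# The true parameters (mu = 1, c = -log 2) versus a deliberately wrong guess.
theta_true = np.array([1.0, -np.log(2.0)])
theta_off = np.array([3.0, 0.0])

loss_true = nce_loss(theta_true, data, noise, noise_mean, noise_std)
loss_off = nce_loss(theta_off, data, noise, noise_mean, noise_std)
print(loss_true, loss_off)
```

The loss only ever evaluates the unnormalized log-density plus the learned constant c, so no partition function is computed. This toy example is one-dimensional and does not by itself exhibit the ill-conditioning the paper describes, which concerns growth in the ambient dimension.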
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Gaussian Processes and Bayesian Inference · Bayesian Methods and Mixture Models
