High-dimensional Asymptotics of Denoising Autoencoders
Hugo Cui, Lenka Zdeborová

TL;DR
This paper analyzes the denoising performance of a two-layer non-linear autoencoder with skip connections in high-dimensional settings, providing explicit formulas and insights into its advantages over PCA-like models.
Contribution
It offers the first high-dimensional asymptotic analysis of denoising autoencoders with skip connections, including closed-form error expressions and empirical validation.
Findings
Closed-form expressions for denoising error in high dimensions
Quantitative comparison showing advantage over PCA-based autoencoders
Empirical results matching theoretical predictions on real datasets
Abstract
We address the problem of denoising data from a Gaussian mixture using a two-layer non-linear autoencoder with tied weights and a skip connection. We consider the high-dimensional limit where the number of training samples and the input dimension jointly tend to infinity while the number of hidden units remains bounded. We provide closed-form expressions for the denoising mean-squared test error. Building on this result, we quantitatively characterize the advantage of the considered architecture over the autoencoder without the skip connection that relates closely to principal component analysis. We further show that our results accurately capture the learning curves on a range of real data sets.
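The architecture named in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. The abstract specifies only two layers, tied weights, and a skip connection, so everything else here is an assumption: the scalar skip strength `b`, the `1/sqrt(d)` scaling, the `tanh` activation, the two-cluster mixture with centroids `±mu`, and the helper names `dae_forward`, `sample_noisy_mixture`, and `denoising_mse`.

```python
import numpy as np


def dae_forward(x, w, b, activation=np.tanh):
    """Two-layer autoencoder with tied weights and a scalar skip connection.

    x: (n, d) noisy inputs; w: (p, d) weights shared by encoder and decoder;
    b: skip-connection strength (assumed to be a single scalar here).
    """
    d = x.shape[-1]
    hidden = activation(x @ w.T / np.sqrt(d))  # encoder: (n, p)
    return b * x + hidden @ w / np.sqrt(d)     # tied decoder plus skip: (n, d)


def sample_noisy_mixture(n, d, mu, noise_std, rng):
    """Draw clean points from a two-cluster Gaussian mixture (assumed
    centroids +mu and -mu, unit cluster variance) and corrupt them with
    additive Gaussian noise of standard deviation noise_std."""
    signs = rng.choice([-1.0, 1.0], size=(n, 1))
    clean = signs * mu + rng.standard_normal((n, d))
    noisy = clean + noise_std * rng.standard_normal((n, d))
    return clean, noisy


def denoising_mse(clean, noisy, w, b):
    """Mean-squared error per coordinate between clean data and the
    autoencoder output on the noisy data."""
    return float(np.mean((clean - dae_forward(noisy, w, b)) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, p = 200, 50, 1                       # p stays small while n, d grow
    mu = np.ones(d)                            # hypothetical centroid
    clean, noisy = sample_noisy_mixture(n, d, mu, noise_std=0.5, rng=rng)
    w = rng.standard_normal((p, d))            # untrained weights, for shape checking only
    print(denoising_mse(clean, noisy, w, b=0.5))
```

Setting `b = 0` in this sketch removes the skip connection and leaves the plain bottleneck autoencoder that the abstract relates to principal component analysis. The weights above are random rather than trained, so the printed error only checks that the pieces fit together.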
Taxonomy
Topics: Image and Signal Denoising Methods · Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference
