On the Regularization of Autoencoders

Harald Steck; Dario Garcia Garcia

arXiv:2110.11402·cs.LG·October 25, 2021

TL;DR

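This paper investigates how the unsupervised training objective of autoencoders inherently regularizes the learned model. It shows that, under a few additional assumptions, a deep nonlinear autoencoder cannot fit the training data more accurately than a linear autoencoder with the same dimensionality in its last hidden layer, and it provides a closed-form approximation for a constrained low-rank autoencoder model.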

Contribution

It extends recent linear model results to nonlinear and constrained autoencoders, revealing inherent regularization effects and deriving an approximation for the EDLAE model's optimal solution.

Findings

01

The unsupervised setting by itself induces strong additional regularization, severely reducing the model capacity of the learned autoencoder.

02

Under a few additional assumptions, a deep nonlinear autoencoder cannot fit the training data more accurately than a linear autoencoder with the same last hidden layer size.

03

The derived approximation accurately predicts the EDLAE model's optimal solution across datasets.

Abstract

While much work has been devoted to understanding the implicit (and explicit) regularization of deep nonlinear networks in the supervised setting, this paper focuses on unsupervised learning, i.e., autoencoders are trained with the objective of reproducing the output from the input. We extend recent results [Jin et al. 2021] on unconstrained linear models and apply them to (1) nonlinear autoencoders and (2) constrained linear autoencoders, obtaining the following two results: first, we show that the unsupervised setting by itself induces strong additional regularization, i.e., a severe reduction in the model-capacity of the learned autoencoder: we derive that a deep nonlinear autoencoder cannot fit the training data more accurately than a linear autoencoder does if both models have the same dimensionality in their last hidden layer (and under a few additional assumptions). Our second…
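The first result in the abstract can be illustrated numerically with a rank argument. The sketch below is not the paper's proof; it assumes squared reconstruction error and a final layer that is linear without a bias term, so the reconstruction is a product of an n x k hidden matrix and a k x d weight matrix and therefore has rank at most k. By the Eckart-Young theorem, no such matrix can beat the truncated SVD, which a rank-k linear autoencoder attains. The function names, the random data, and the toy tanh encoder are all hypothetical choices made for illustration.

```python
import numpy as np

def best_rank_k_error(X, k):
    """Squared Frobenius error of the best rank-k approximation of X (Eckart-Young)."""
    s = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(s[k:] ** 2))

def linear_autoencoder_error(X, k):
    """Training error of the rank-k linear autoencoder X @ V_k @ V_k.T,
    where V_k holds the top-k right singular vectors of X."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T
    return float(np.sum((X - X @ V_k @ V_k.T) ** 2))

def nonlinear_autoencoder_error(X, k, rng):
    """Training error of a toy nonlinear autoencoder (assumption: random tanh
    encoder) whose linear output layer is fitted optimally by least squares."""
    W_enc = rng.normal(size=(X.shape[1], k))
    H = np.tanh(X @ W_enc)                      # n x k last hidden layer
    W_out, *_ = np.linalg.lstsq(H, X, rcond=None)
    return float(np.sum((X - H @ W_out) ** 2))  # H @ W_out has rank <= k

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))   # hypothetical training data
k = 5                            # size of the last hidden layer

err_opt = best_rank_k_error(X, k)
err_lin = linear_autoencoder_error(X, k)
err_nonlin = nonlinear_autoencoder_error(X, k, rng)
```

Here `err_lin` matches the Eckart-Young optimum `err_opt`, and `err_nonlin` cannot fall below it however the encoder is chosen, because only the rank of the reconstruction matters on the training data. The paper's exact assumptions may differ from the ones stated here.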

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Tensor decomposition and applications