Implicit Greedy Rank Learning in Autoencoders via Overparameterized Linear Networks
Shih-Yu Sun, Vimal Thilak, Etai Littwin, Omid Saremi, Joshua M. Susskind

TL;DR
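This paper investigates how autoencoders with an overparameterized linear sub-network at the bottleneck implicitly learn low-rank latent codes under gradient descent. It proposes orthogonal initialization and learning rate adjustment to stabilize training, so that the latent rank converges to the ground-truth rank on synthetic data and to ranks that suit downstream classification and image sampling.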
Contribution
It extends the analysis of implicit rank regularization from matrix factorization to autoencoders, and proposes orthogonal initialization and principled learning rate adjustment to reduce the sensitivity of training dynamics to spectral prior and linear depth.
Findings
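Linear autoencoders converge stably to the ground-truth latent code rank on synthetic data.
Nonlinear autoencoders converge to latent ranks that are optimal for downstream classification and image sampling.
Orthogonal initialization and learning rate adjustment make training dynamics less sensitive to spectral prior and linear depth.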
Abstract
Deep linear networks trained with gradient descent yield low-rank solutions, as is typically studied in matrix factorization. In this paper, we take a step further and analyze implicit rank regularization in autoencoders. We show greedy learning of low-rank latent codes induced by a linear sub-network at the autoencoder bottleneck. We further propose orthogonal initialization and principled learning rate adjustment to mitigate sensitivity of training dynamics to spectral prior and linear depth. With linear autoencoders on synthetic data, our method converges stably to ground-truth latent code rank. With nonlinear autoencoders, our method converges to latent ranks optimal for downstream classification and image sampling.
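The linear sub-network and orthogonal initialization described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the dimensions (`dim`, `depth`, `true_rank`, `batch`), and the thresholded singular-value count used as a rank measure are all assumptions chosen for illustration. The sketch only shows the state at initialization, where a stack of orthogonally initialized square layers collapses to a single orthogonal matrix and therefore leaves the rank of the latent codes unchanged. No training is performed here.

```python
import numpy as np

def orthogonal(n, rng):
    """Sample an n x n orthogonal matrix via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    # Flip column signs using diag(r); the result is still orthogonal.
    return q * np.sign(np.diag(r))

def init_linear_bottleneck(dim, depth, rng):
    """Return `depth` square weight matrices, each orthogonally initialized."""
    return [orthogonal(dim, rng) for _ in range(depth)]

def apply_bottleneck(weights, z):
    """Pass latent codes z (batch x dim) through the linear layers in order."""
    for w in weights:
        z = z @ w.T
    return z

def numerical_rank(codes, rel_tol=1e-6):
    """Count singular values above rel_tol times the largest one.

    Assumption: a simple thresholding heuristic, not necessarily the
    rank measure used in the paper.
    """
    s = np.linalg.svd(codes, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(0)
dim, depth, true_rank, batch = 16, 4, 3, 200  # hypothetical sizes

weights = init_linear_bottleneck(dim, depth, rng)

# Synthetic latent codes with a known rank (batch x dim, rank = true_rank).
codes = rng.standard_normal((batch, true_rank)) @ rng.standard_normal((true_rank, dim))
out = apply_bottleneck(weights, codes)

# The stacked layers are equivalent to one matrix: W_depth ... W_2 W_1.
product = np.linalg.multi_dot(weights[::-1])

print("singular values of collapsed product:", np.linalg.svd(product, compute_uv=False))
print("rank of codes before / after bottleneck:", numerical_rank(codes), numerical_rank(out))
```

Because every singular value of the collapsed product equals one at initialization, no direction in latent space is favored before training starts. Any rank reduction would then have to come from the gradient descent dynamics, which is the regime the paper analyzes.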
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
