Sparse Coding and Autoencoders
Akshay Rangamani, Anirbit Mukherjee, Amitabh Basu, Tejaswini Ganapathy, Ashish Arora, Sang Chin, Trac D. Tran

TL;DR
This paper rigorously analyzes whether gradient descent on the squared loss of a simple autoencoder can solve the dictionary learning problem. Under mild distributional assumptions on the sparse codes, it shows that the expected gradient is asymptotically negligible in a small neighborhood of the true dictionary.
Contribution
It proves that the norm of the expected gradient of an autoencoder's squared loss is asymptotically negligible near the true dictionary, and that a layer of ReLU gates can recover the support of the sparse codes independently of the loss function.
Findings
The norm of the expected gradient is asymptotically negligible (in the sparse code dimension) near the true dictionary.
Experimental evidence supports local optimality of the true dictionary.
ReLU layers can recover sparse code support regardless of the loss function.
Abstract
In "Dictionary Learning" one tries to recover incoherent matrices $A^* \in \mathbb{R}^{n \times h}$ (typically overcomplete and whose columns are assumed to be normalized) and sparse vectors $x^* \in \mathbb{R}^h$ with a small support of size $h^p$ for some $0 < p < 1$ while having access to observations $y \in \mathbb{R}^n$ where $y = A^* x^*$. In this work we undertake a rigorous analysis of whether gradient descent on the squared loss of an autoencoder can solve the dictionary learning problem. The "Autoencoder" architecture we consider is a $\mathbb{R}^n \rightarrow \mathbb{R}^n$ mapping with a single ReLU activation layer of size $h$. Under very mild distributional assumptions on $x^*$, we prove that the norm of the expected gradient of the standard squared loss function is asymptotically (in sparse code dimension) negligible for all points in a small neighborhood of $A^*$. This is…
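The setup in the abstract can be illustrated with a small NumPy sketch. Several choices here are assumptions made only for illustration and are not taken from the paper: the dictionary is the deterministic identity-plus-Hadamard matrix (chosen because its coherence is exactly 1/sqrt(n)), the nonzero code entries are drawn uniformly from [1, 2), the encoder has the form ReLU(W y - eps) with a tied decoder W^T, and the threshold eps = 0.5 is picked by hand. The names `hadamard`, `autoencoder`, `A_star`, `x_star`, and `eps` are hypothetical.

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n must be a power of 2)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
    return H

n, k = 256, 3  # observation dimension and sparsity (illustrative values)

# Assumed dictionary: A* = [I | H / sqrt(n)], overcomplete with unit-norm columns.
# Any two distinct columns have inner product of magnitude at most 1/sqrt(n) = 1/16.
A_star = np.hstack([np.eye(n), hadamard(n) / np.sqrt(n)])
h = A_star.shape[1]  # code dimension, here 2n = 512

# Sparse code: k nonzero entries in [1, 2) on a random support (assumed distribution).
rng = np.random.default_rng(0)
support = rng.choice(h, size=k, replace=False)
x_star = np.zeros(h)
x_star[support] = rng.uniform(1.0, 2.0, size=k)

# Observation y = A* x*.
y = A_star @ x_star

def autoencoder(W, y, eps):
    """Assumed architecture: code = ReLU(W y - eps), reconstruction = W^T code."""
    code = np.maximum(W @ y - eps, 0.0)
    return code, W.T @ code

# Evaluate at the true dictionary, W = A*^T.
# Off-support pre-activations are at most k * 2 / 16 = 0.375 in magnitude,
# on-support ones are at least 1 - (k - 1) * 2 / 16 = 0.75, so eps = 0.5 separates them.
W = A_star.T
eps = 0.5
code, y_hat = autoencoder(W, y, eps)

recovered = np.flatnonzero(code)            # indices where the ReLU layer is active
loss = 0.5 * np.sum((y_hat - y) ** 2)       # squared reconstruction loss

print("true support:     ", sorted(support.tolist()))
print("recovered support:", sorted(recovered.tolist()))
print("squared loss:     ", loss)
```

With these particular bounds the active set of the ReLU layer coincides with the support of the sparse code for any choice of support, which is the kind of loss-independent support recovery the summary refers to. The reconstruction is not exact, since the thresholded code is a shifted and slightly perturbed version of the true code.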
