Numerically Recovering the Critical Points of a Deep Linear Autoencoder
Charles G. Frye, Neha S. Wadia, Michael R. DeWeese, and Kristofer E. Bouchard

TL;DR
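This paper evaluates methods for numerically locating critical points, using the deep linear autoencoder as a test case in which the ground truth is known. It finds that sampling-based approaches are biased, that strict numerical tolerances are required, and that a Newton method identifies critical points better than previous approaches.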
Contribution
It systematically assesses existing critical-point-finding methods on deep linear autoencoders, where the ground truth is known, and characterizes their effectiveness and limitations.
Findings
Existing methods recover some information about the loss surface, but they sample its critical points in a biased way.
Strict numerical tolerances are necessary for accurate critical point identification.
A Newton method outperforms previous approaches in finding critical points.
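To make the last two findings concrete, here is a minimal sketch of a plain Newton iteration for critical points with a strict gradient-norm stopping rule. It is not the specific Newton variant evaluated in the paper; the toy function f(x, y) = x^4/4 - x^2/2 + y^2/2, the tolerance of 1e-10, and all function names are assumptions chosen for illustration.

```python
import numpy as np

# Toy non-convex surface (an assumption for illustration, not from the paper):
#   f(x, y) = x**4 / 4 - x**2 / 2 + y**2 / 2
# Its critical points are (0, 0), a saddle, and (+1, 0), (-1, 0), both minima.

def grad(p):
    x, y = p
    return np.array([x**3 - x, y])

def hess(p):
    x, _ = p
    return np.array([[3.0 * x**2 - 1.0, 0.0],
                     [0.0, 1.0]])

def newton_critical_point(p0, tol=1e-10, max_iter=50):
    """Iterate p <- p - H(p)^{-1} g(p) until the gradient norm drops below tol."""
    p = np.asarray(p0, dtype=float)
    for _ in range(max_iter):
        g = grad(p)
        if np.linalg.norm(g) < tol:
            break
        p = p - np.linalg.solve(hess(p), g)
    return p, np.linalg.norm(grad(p))

point, gnorm = newton_critical_point([0.2, 0.5])

# Classify the recovered point by its index: the number of negative
# Hessian eigenvalues (0 for a minimum, 1 or more for a saddle).
index = int(np.sum(np.linalg.eigvalsh(hess(point)) < 0))
print(point, gnorm, index)
```

Started from (0.2, 0.5), the iteration converges to the saddle at the origin, whereas gradient descent from the same point would move toward the minimum at (1, 0). Newton-type iterations are attracted to critical points of any index, which is why they are useful for this task, and the strict tolerance ensures the Hessian is evaluated at a point that is actually stationary.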
Abstract
Numerically locating the critical points of non-convex surfaces is a long-standing problem central to many fields. Recently, the loss surfaces of deep neural networks have been explored to gain insight into outstanding questions in optimization, generalization, and network architecture design. However, the degree to which recently-proposed methods for numerically recovering critical points actually do so has not been thoroughly evaluated. In this paper, we examine this issue in a case for which the ground truth is known: the deep linear autoencoder. We investigate two sub-problems associated with numerical critical point identification: first, because of large parameter counts, it is infeasible to find all of the critical points for contemporary neural networks, necessitating sampling approaches whose characteristics are poorly understood; second, the numerical tolerance for accurately…
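The ground truth mentioned in the abstract can be illustrated with the classical result for linear autoencoders (Baldi and Hornik, 1989): the critical points correspond to projections onto subspaces spanned by subsets of eigenvectors of the data covariance. The sketch below checks this numerically for a two-layer linear autoencoder, which is a simplifying assumption; the paper's networks may be deeper, and the data, dimensions, and function names here are hypothetical.

```python
import numpy as np

def autoencoder_grads(W1, W2, X):
    """Gradients of 0.5 * ||X - W2 @ W1 @ X||_F^2 with respect to W1 and W2."""
    R = X - W2 @ W1 @ X            # reconstruction residual
    gW1 = -W2.T @ R @ X.T
    gW2 = -R @ X.T @ W1.T
    return gW1, gW2

def grad_norm(W1, W2, X):
    g1, g2 = autoencoder_grads(W1, W2, X)
    return np.sqrt(np.sum(g1**2) + np.sum(g2**2))

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 50))   # hypothetical data: 4 features, 50 samples

# Eigenvectors of the (unnormalized) covariance X @ X.T, eigenvalues ascending.
_, U = np.linalg.eigh(X @ X.T)

# Pick an arbitrary, deliberately non-top subset of eigenvectors.
U_S = U[:, [0, 2]]
W1, W2 = U_S.T, U_S                # encoder and decoder built from that subset

norm_at_critical = grad_norm(W1, W2, X)

# A nearby perturbed point should not be stationary.
W1_perturbed = W1 + 0.1 * rng.standard_normal(W1.shape)
norm_at_perturbed = grad_norm(W1_perturbed, W2, X)

print(norm_at_critical, norm_at_perturbed)
```

At the analytically constructed point the gradient norm vanishes up to floating-point error, while the perturbed point has a clearly non-zero gradient. Having such points available in closed form is what allows a numerical critical-point finder's output to be compared against a known answer.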
Taxonomy
Topics: Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
