On the Loss Landscape Geometry of Regularized Deep Matrix Factorization: Uniqueness and Sharpness
Anil Kamber, Rahul Parhi

TL;DR
This paper investigates the loss landscape of regularized deep matrix factorization, revealing conditions for uniqueness of minimizers, properties of the Hessian spectrum, and the effects of regularization parameters.
Contribution
It provides a theoretical analysis showing the uniqueness of minimizers and spectral properties of the Hessian in regularized deep matrix factorization problems.
Findings
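A unique end-to-end minimizer exists for almost all target matrices (all except a set of Lebesgue measure zero).
The Hessian spectrum is constant across all minimizers of the regularized deep scalar factorization problem.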
A critical regularization threshold causes the minimizer to collapse to zero.
Abstract
Weight decay is ubiquitous in training deep neural network architectures. Its empirical success is often attributed to capacity control; nonetheless, our theoretical understanding of its effect on the loss landscape and the set of minimizers remains limited. In this paper, we show that $\ell^2$-regularized deep matrix factorization/deep linear network training problems with squared-error loss admit a unique end-to-end minimizer for all target matrices subject to factorization, except for a set of Lebesgue measure zero formed by the depth and the regularization parameter. This observation reveals fundamental properties of the loss landscape of regularized deep matrix factorization problems: the Hessian spectrum is constant across all minimizers of the regularized deep scalar factorization problem with squared-error loss. Moreover, we show that, in regularized deep matrix factorization…
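The constant-Hessian-spectrum statement for the scalar case can be checked numerically on a toy instance. The sketch below is not taken from the paper: it assumes the objective (w_1 ⋯ w_L − y)^2 + λ Σ w_i^2 (the paper's exact normalization may differ), and the values depth 3, y = 2, λ = 0.1, as well as all function names, are illustrative choices. It builds the balanced stationary points that differ only by sign flips and compares the Hessian eigenvalues at each of them.

```python
import itertools
import numpy as np

# Toy setup (all values are illustrative assumptions, not from the paper).
DEPTH = 3      # number of scalar factors
TARGET = 2.0   # scalar target y > 0
LAM = 0.1      # regularization weight

def loss(w):
    """Assumed objective: (w_1 * ... * w_L - y)^2 + lam * sum_i w_i^2."""
    return (np.prod(w) - TARGET) ** 2 + LAM * np.sum(w ** 2)

def gradient(w):
    # The derivative of the product w.r.t. w_i is the product of the other factors.
    partials = np.array([np.prod(np.delete(w, i)) for i in range(len(w))])
    return 2 * (np.prod(w) - TARGET) * partials + 2 * LAM * w

def hessian(w):
    n = len(w)
    residual = np.prod(w) - TARGET
    partials = np.array([np.prod(np.delete(w, i)) for i in range(n)])
    hess = 2 * np.outer(partials, partials) + 2 * LAM * np.eye(n)
    for i in range(n):
        for j in range(n):
            if i != j:
                # Mixed second derivative of the product: drop factors i and j.
                hess[i, j] += 2 * residual * np.prod(np.delete(w, [i, j]))
    return hess

# For DEPTH = 3, balanced stationary points w_i = s_i * a satisfy
# a^4 - y*a + lam = 0; take the largest positive real root.
roots = np.roots([1.0, 0.0, 0.0, -TARGET, LAM])
a = max(r.real for r in roots if abs(r.imag) < 1e-9 and r.real > 0)

# Sign patterns with product +1 leave the end-to-end product a^3 unchanged.
signs = [np.array(s) for s in itertools.product([1.0, -1.0], repeat=DEPTH)
         if np.prod(s) > 0]
points = [a * s for s in signs]
spectra = [np.linalg.eigvalsh(hessian(w)) for w in points]

for w, spec in zip(points, spectra):
    print("w =", w, "loss =", loss(w), "Hessian eigenvalues =", spec)
```

In this toy instance the four points share one end-to-end product and one Hessian spectrum: flipping the signs of two factors conjugates the Hessian by a diagonal ±1 matrix, which leaves the eigenvalues unchanged.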
