Flat minima generalize for low-rank matrix recovery
Lijun Ding, Dmitriy Drusvyatskiy, Maryam Fazel, Zaid Harchaoui

TL;DR
This paper studies why flat minima of overparameterized models such as neural networks tend to generalize well. It focuses on low-rank matrix recovery and shows that flat minima, measured by the trace of the Hessian, exactly recover the ground truth in several settings and weakly recover it in matrix completion.
Contribution
A theoretical analysis linking flat minima, measured by the trace of the Hessian, to ground-truth recovery in overparameterized matrix and bilinear sensing, robust PCA, covariance matrix estimation, and single hidden layer neural networks with quadratic activation functions.
Findings
Flat minima exactly recover the ground truth under standard statistical assumptions
For matrix completion, weak recovery is established; empirical evidence suggests exact recovery
Depth influences the flatness and recovery properties in neural networks
Abstract
Empirical evidence suggests that for a variety of overparameterized nonlinear models, most notably in neural network training, the growth of the loss around a minimizer strongly impacts its performance. Flat minima -- those around which the loss grows slowly -- appear to generalize well. This work takes a step towards understanding this phenomenon by focusing on the simplest class of overparameterized nonlinear models: those arising in low-rank matrix recovery. We analyze overparameterized matrix and bilinear sensing, robust PCA, covariance matrix estimation, and single hidden layer neural networks with quadratic activation functions. In all cases, we show that flat minima, measured by the trace of the Hessian, exactly recover the ground truth under standard statistical assumptions. For matrix completion, we establish weak recovery, although empirical evidence suggests exact recovery…
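To make "flatness measured by the trace of the Hessian" concrete, here is a minimal sketch for overparameterized matrix sensing. It is an illustration under stated assumptions, not the authors' code: the loss f(L, R) = 0.5 * sum_i (<A_i, L R^T> - b_i)^2, the Gaussian measurement matrices, the problem sizes, and all function names are hypothetical choices, and the paper's normalization may differ. Because each measurement is bilinear in (L, R), the Hessian trace of this loss equals sum_i (||A_i R||_F^2 + ||A_i^T L||_F^2). The sketch compares two factorizations that both fit the measurements exactly, one balanced and one rescaled.

```python
import numpy as np

def sensing_loss(L, R, A, b):
    # f(L, R) = 0.5 * sum_i (<A_i, L R^T> - b_i)^2  (assumed form of the loss)
    residual = np.einsum("ijk,jk->i", A, L @ R.T) - b
    return 0.5 * np.sum(residual ** 2)

def hessian_trace(L, R, A):
    # Closed form: sum_i ||A_i R||_F^2 + ||A_i^T L||_F^2.
    AR = np.einsum("ijk,kr->ijr", A, R)    # A_i R for every i
    AtL = np.einsum("ijk,jr->ikr", A, L)   # A_i^T L for every i
    return np.sum(AR ** 2) + np.sum(AtL ** 2)

def hessian_trace_fd(L, R, A, b, h=0.5):
    # Sum of central second differences over every coordinate of L and R.
    # The loss is quadratic in any single coordinate, so this is exact up to rounding.
    base = sensing_loss(L, R, A, b)
    total = 0.0
    for M in (L, R):
        for idx in np.ndindex(M.shape):
            old = M[idx]
            M[idx] = old + h
            up = sensing_loss(L, R, A, b)
            M[idx] = old - h
            down = sensing_loss(L, R, A, b)
            M[idx] = old  # restore the entry
            total += (up - 2.0 * base + down) / h ** 2
    return total

rng = np.random.default_rng(0)
d1, d2, true_rank, k, m = 6, 6, 2, 4, 40   # k > true_rank: overparameterized

# Ground truth of rank 2 and its noiseless measurements.
X = rng.standard_normal((d1, true_rank)) @ rng.standard_normal((true_rank, d2))
A = rng.standard_normal((m, d1, d2))
b = np.einsum("ijk,jk->i", A, X)

# Balanced factorization from the SVD, padded to k columns.
U, s, Vt = np.linalg.svd(X)
L_bal = U[:, :k] * np.sqrt(s[:k])
R_bal = Vt[:k].T * np.sqrt(s[:k])

# Rescaled factorization with the same product L R^T, hence the same (zero) loss.
L_unb, R_unb = 3.0 * L_bal, R_bal / 3.0

trace_bal = hessian_trace(L_bal, R_bal, A)
trace_unb = hessian_trace(L_unb, R_unb, A)
print("balanced trace:", trace_bal)
print("rescaled trace:", trace_unb)
print("finite-difference check:", hessian_trace_fd(L_bal.copy(), R_bal.copy(), A, b))
```

Both factorizations are global minimizers of the loss, yet the rescaled one has a larger Hessian trace, so a flatness criterion distinguishes between solutions that fit the data equally well. In the idealized fully observed case with d1 = d2 = d, the trace reduces to d(||L||_F^2 + ||R||_F^2), whose minimum over factorizations of a fixed matrix X is 2d times the nuclear norm of X.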
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Photoacoustic and Ultrasonic Imaging · Optical Polarization and Ellipsometry
Methods: Principal Components Analysis
