Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks
Johan S. Wind, Vegard Antun, Anders C. Hansen

TL;DR
This paper investigates the implicit regularization imposed by gradient flow in diagonal linear networks (DLNs), links it to phase transitions in generalized hardness of approximation (GHA), and provides sharp convergence results and characterizations of the solutions.
Contribution
It offers the first sharp analysis of implicit regularization in DLNs, connecting it to GHA phenomena and characterizing the selected minimizers based on network depth.
Findings
Gradient flow of DLNs with tiny initialization approximates basis pursuit minimizers.
Convergence bounds depend on initialization size.
When basis pursuit has more than one minimizer, the paper characterizes which one is selected, depending on network depth.
Abstract
Understanding the implicit regularization imposed by neural network architectures and gradient-based optimization methods is a key challenge in deep learning and AI. In this work we provide sharp results for the implicit regularization imposed by the gradient flow of Diagonal Linear Networks (DLNs) in the over-parameterized regression setting and, potentially surprisingly, link this to the phenomenon of phase transitions in generalized hardness of approximation (GHA). GHA generalizes the phenomenon of hardness of approximation from computer science to, among others, continuous and robust optimization. It is well-known that the $\ell^1$-norm of the gradient flow of DLNs with tiny initialization converges to the objective function of basis pursuit. We improve upon these results by showing that the gradient flow of DLNs with tiny initialization approximates minimizers of the basis pursuit…
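The effect described in the abstract can be seen on a toy problem. The sketch below is not taken from the paper: it assumes a depth-2 parameterization x = u∘u − v∘v, uses plain gradient descent with a small step size as a stand-in for gradient flow, and works on a hypothetical one-equation system x1 + 2·x2 = 2. For this system the basis pursuit problem (minimize ‖x‖₁ subject to a·x = y) has the unique minimizer (0, 1) with ℓ¹-norm 1, while the minimum ℓ²-norm solution is (0.4, 0.8) with ℓ¹-norm 1.2. The function names, step size, and iteration count are illustrative choices.

```python
def dln_gradient_descent(a, y, alpha, lr=0.01, steps=5000):
    """Gradient descent on L(u, v) = 0.5 * (a . (u*u - v*v) - y)**2.

    Assumed setup (not from the paper): a depth-2 diagonal linear network
    x = u*u - v*v, a single linear equation a . x = y, and the
    initialization u = v = alpha in every coordinate, so x starts at 0.
    """
    n = len(a)
    u = [alpha] * n
    v = [alpha] * n
    for _ in range(steps):
        x = [ui * ui - vi * vi for ui, vi in zip(u, v)]
        r = sum(ai * xi for ai, xi in zip(a, x)) - y  # residual a . x - y
        # dL/du_i = 2 * r * a_i * u_i   and   dL/dv_i = -2 * r * a_i * v_i
        u = [ui - lr * 2.0 * r * ai * ui for ui, ai in zip(u, a)]
        v = [vi + lr * 2.0 * r * ai * vi for vi, ai in zip(v, a)]
    return [ui * ui - vi * vi for ui, vi in zip(u, v)]


def l1_norm(x):
    """Sum of absolute values, the basis pursuit objective."""
    return sum(abs(xi) for xi in x)


if __name__ == "__main__":
    a, y = [1.0, 2.0], 2.0  # toy system: x1 + 2 * x2 = 2
    for alpha in (1.0, 1e-1, 1e-3):
        x = dln_gradient_descent(a, y, alpha)
        print(f"alpha={alpha:g}  x={x}  l1={l1_norm(x):.6f}")
```

In this toy setting, shrinking `alpha` should move the limit point from near the minimum ℓ²-norm solution towards the basis pursuit minimizer (0, 1), which is the qualitative dependence on initialization size that the paper quantifies. Deeper networks and systems with non-unique basis pursuit minimizers, which the paper also treats, are outside the scope of this sketch.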
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Photoacoustic and Ultrasonic Imaging · Machine Learning and ELM
