Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks

Johan S. Wind; Vegard Antun; Anders C. Hansen

arXiv:2307.07410 · cs.LG · July 17, 2023 · 1 citation

TL;DR

This paper investigates the implicit regularization imposed by gradient flow in diagonal linear networks (DLNs), links it to phase transitions in generalized hardness of approximation (GHA), and provides sharp convergence results together with a characterization of the solutions that gradient flow selects.

Contribution

The paper offers the first sharp analysis of implicit regularization in DLNs, connects it to GHA phenomena, and characterizes which minimizer is selected as a function of network depth.

Findings

01 With tiny initialization, the gradient flow of DLNs approximates minimizers of basis pursuit.

02 The convergence bounds depend on the size of the initialization.

03 The paper characterizes which minimizer gradient flow selects, depending on network depth.

Abstract

Understanding the implicit regularization imposed by neural network architectures and gradient-based optimization methods is a key challenge in deep learning and AI. In this work we provide sharp results for the implicit regularization imposed by the gradient flow of Diagonal Linear Networks (DLNs) in the over-parameterized regression setting and, potentially surprisingly, link this to the phenomenon of phase transitions in generalized hardness of approximation (GHA). GHA generalizes the phenomenon of hardness of approximation from computer science to, among others, continuous and robust optimization. It is well-known that the $\ell^{1}$-norm of the gradient flow of DLNs with tiny initialization converges to the objective function of basis pursuit. We improve upon these results by showing that the gradient flow of DLNs with tiny initialization approximates minimizers of the basis pursuit…
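The behavior described in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' code: it assumes the common depth-2 parameterization $x = u \odot u - v \odot v$ with $u = v = \alpha$ at initialization, and it uses small-step gradient descent as a stand-in for gradient flow. The function name `dln_gradient_descent` and the parameters `alpha`, `lr`, and `steps` are hypothetical choices made for illustration. The example system $x_1 + 2x_2 = 2$ has the unique basis pursuit ($\ell^{1}$-minimal) solution $(0, 1)$.

```python
import numpy as np

def dln_gradient_descent(A, y, alpha=1e-3, lr=0.01, steps=5000):
    """Gradient descent on a depth-2 diagonal linear network (illustrative).

    Assumed parameterization: x = u*u - v*v with u = v = alpha initially.
    Small-step gradient descent approximates gradient flow on
    the loss 0.5 * ||A x - y||^2.
    """
    n = A.shape[1]
    u = np.full(n, alpha)
    v = np.full(n, alpha)
    for _ in range(steps):
        x = u * u - v * v
        g = A.T @ (A @ x - y)  # gradient of the loss with respect to x
        # Chain rule: dL/du = 2*u*g and dL/dv = -2*v*g (simultaneous update).
        u, v = u - lr * 2.0 * u * g, v + lr * 2.0 * v * g
    return u * u - v * v

# Under-determined system x1 + 2*x2 = 2; its l1-minimal solution is (0, 1).
A = np.array([[1.0, 2.0]])
y = np.array([2.0])
x = dln_gradient_descent(A, y)
print("recovered x:", x)
print("l1 norm:", np.abs(x).sum())
```

Under these assumptions the recovered vector should land close to $(0, 1)$, and rerunning with a smaller `alpha` should shrink the gap, which is in the spirit of the finding that the convergence bounds depend on the initialization size.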

Code & Models

Repositories

johanwind/which_l1_minimizer (Official)

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Photoacoustic and Ultrasonic Imaging · Machine Learning and ELM