Characterizing Implicit Bias in Terms of Optimization Geometry
Suriya Gunasekar, Jason Lee, Daniel Soudry, Nathan Srebro

TL;DR
This paper investigates how the optimization geometry (the potential or norm that defines the method) shapes the implicit bias of algorithms such as mirror descent, natural gradient descent, and steepest descent on underdetermined linear regression and separable linear classification, that is, which of the many global minima each algorithm converges to.
Contribution
It provides a theoretical analysis of the implicit bias induced by different optimization geometries in underdetermined linear problems, and asks whether the solution reached can be characterized independently of hyperparameters.
Findings
Different optimization geometries lead to distinct implicit biases.
The specific global minimum reached can be characterized by the potential or norm of the optimization geometry.
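Whether this characterization holds independently of hyperparameter choices such as step-size and momentum is the central question the paper examines; the abstract poses it as a question rather than a blanket guarantee.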
Abstract
We study the implicit bias of generic optimization methods, such as mirror descent, natural gradient descent, and steepest descent with respect to different potentials and norms, when optimizing underdetermined linear regression or separable linear classification problems. We explore the question of whether the specific global minimum (among the many possible global minima) reached by an algorithm can be characterized in terms of the potential or norm of the optimization geometry, and independently of hyperparameter choices such as step-size and momentum.
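The simplest instance of the question posed in the abstract can be checked numerically. The sketch below is not taken from the paper; it relies on the standard fact that gradient descent on an underdetermined least-squares problem, started from zero, stays in the row space of the data matrix and therefore converges to the minimum Euclidean norm solution. The problem sizes, step sizes, and the helper name `gradient_descent` are illustrative assumptions.

```python
import numpy as np

# Hypothetical problem sizes: fewer equations than unknowns,
# so X w = y has infinitely many exact solutions.
rng = np.random.default_rng(0)
n, d = 5, 20
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)


def gradient_descent(X, y, w0, step, iters):
    """Plain gradient descent on the squared loss 0.5 * ||X w - y||^2."""
    w = w0.copy()
    for _ in range(iters):
        w -= step * X.T @ (X @ w - y)
    return w


# Largest eigenvalue of X^T X; any step below 2 / lam_max converges.
lam_max = np.linalg.norm(X, 2) ** 2

# Two different step sizes, same zero initialization.
w_a = gradient_descent(X, y, np.zeros(d), 1.0 / lam_max, 5000)
w_b = gradient_descent(X, y, np.zeros(d), 0.5 / lam_max, 5000)

# Minimum l2-norm interpolating solution via the pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y

print("residual:", np.linalg.norm(X @ w_a - y))
print("distance to min-norm solution:", np.linalg.norm(w_a - w_min_norm))
print("distance between step sizes:", np.linalg.norm(w_a - w_b))
```

In this setting both runs land on the same interpolating solution, which is the kind of hyperparameter-independent characterization the abstract asks about. Gradient descent corresponds to the squared Euclidean potential; other potentials or norms would select different global minima.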
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms
