Characterizing Implicit Bias in Terms of Optimization Geometry

Suriya Gunasekar; Jason Lee; Daniel Soudry; Nathan Srebro

arXiv:1802.08246·stat.ML·June 24, 2020·47 cites

Open Access

TL;DR

This paper investigates how the optimization geometry shapes the implicit bias of algorithms such as mirror descent, natural gradient descent, and steepest descent on underdetermined linear regression and separable linear classification, that is, which of the many possible global minima the algorithm converges to.

Contribution

It provides a theoretical analysis of the implicit bias induced by various optimization geometries in underdetermined linear problems, and asks whether that bias can be characterized independently of hyperparameters such as step-size and momentum.

Findings

01. Different optimization geometries lead to distinct implicit biases.

02. The specific global minimum reached can be characterized by the potential or norm of the optimization geometry.

03. The implicit bias is largely independent of hyperparameter choices such as step-size and momentum.
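
One way to make the second finding concrete, as a sketch rather than a quotation from the paper: for mirror descent with a strongly convex potential on an underdetermined least-squares problem, a characterization of this kind says that if the iterates converge to a global minimum, the limit is the interpolating solution closest to the initialization in Bregman divergence. The symbols below (data matrix X, targets y, initialization w_0, potential ψ, limit point w_∞) are notation assumed here for illustration.

```latex
% Mirror descent update with potential \psi and step-size \eta_t:
\nabla\psi(w_{t+1}) = \nabla\psi(w_t) - \eta_t \nabla L(w_t),
\qquad L(w) = \tfrac{1}{2}\,\lVert Xw - y \rVert_2^2 .

% Characterization of the limit point (assuming convergence to a global minimum):
w_\infty = \operatorname*{arg\,min}_{w \,:\, Xw = y} D_\psi(w, w_0),
\qquad
D_\psi(w, w_0) = \psi(w) - \psi(w_0) - \langle \nabla\psi(w_0),\, w - w_0 \rangle .
```

With ψ(w) = ½‖w‖₂² this reduces to gradient descent, and the limit is the interpolating solution closest to w_0 in Euclidean distance. The step-sizes η_t do not appear in the characterization, which is the sense in which the bias depends on the geometry rather than on that hyperparameter.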

Abstract

We study the implicit bias of generic optimization methods, such as mirror descent, natural gradient descent, and steepest descent with respect to different potentials and norms, when optimizing underdetermined linear regression or separable linear classification problems. We explore the question of whether the specific global minimum (among the many possible global minima) reached by an algorithm can be characterized in terms of the potential or norm of the optimization geometry, and independently of hyperparameter choices such as step-size and momentum.
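
The question posed in the abstract can be illustrated numerically. The following is a minimal sketch, not code from the paper: the toy data `X` and `y`, the step-sizes, the iteration counts, and the function names are all assumptions chosen for illustration. It runs gradient descent (mirror descent with the squared Euclidean potential) and mirror descent with an entropy potential on the same underdetermined least-squares problem. Both drive the loss to zero, but they reach different global minima.

```python
import numpy as np

# Toy underdetermined problem: 2 equations, 4 unknowns (illustrative values).
X = np.array([[1.0, 0.0, 1.0, 2.0],
              [0.0, 1.0, 1.0, 1.0]])
# Targets generated from a positive vector, so a positive interpolating solution exists.
y = X @ np.array([1.0, 2.0, 0.5, 1.0])


def grad(w):
    """Gradient of the loss L(w) = 0.5 * ||X w - y||^2."""
    return X.T @ (X @ w - y)


def gradient_descent(w0, eta=0.1, steps=5000):
    """Mirror descent with potential psi(w) = 0.5 * ||w||^2, i.e. plain gradient descent."""
    w = w0.copy()
    for _ in range(steps):
        w = w - eta * grad(w)
    return w


def entropy_mirror_descent(w0, eta=0.05, steps=5000):
    """Mirror descent with psi(w) = sum_i (w_i log w_i - w_i).

    Since grad psi(w) = log w, the update log w <- log w - eta * grad(w)
    becomes a multiplicative update that keeps every coordinate positive.
    """
    w = w0.copy()
    for _ in range(steps):
        w = w * np.exp(-eta * grad(w))
    return w


w_gd = gradient_descent(np.zeros(4))        # Euclidean geometry, zero initialization
w_md = entropy_mirror_descent(np.ones(4))   # entropy geometry, all-ones initialization

print("gradient descent solution:        ", w_gd)
print("entropy mirror descent solution:  ", w_md)
print("minimum l2-norm solution (pinv):  ", np.linalg.pinv(X) @ y)
print("residual norms:", np.linalg.norm(X @ w_gd - y), np.linalg.norm(X @ w_md - y))
```

In this sketch, gradient descent from zero should land on the minimum ℓ2-norm interpolating solution, which is what `np.linalg.pinv(X) @ y` computes, while the entropy variant lands on a different interpolating solution whose logarithm lies in the row space of `X`. Changing `eta` within the range where both methods converge should leave the two limit points unchanged.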

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms