Implicit bias of any algorithm: bounding bias via margin

Elvis Dohmatob

arXiv:2011.06550 · stat.ML · November 24, 2020

Open Access

TL;DR

This paper proves a nonsmooth Kurdyka-Lojasiewicz inequality with exponent $1/2$ for the margin function in binary classification. The inequality bounds how far a separating hyperplane is from the maximum-margin hyperplane in terms of its margin, which gives a tool for analyzing the implicit bias of optimization algorithms.

Contribution

It proves a nonsmooth Kurdyka-Lojasiewicz inequality with exponent $1/2$ for the margin function, so that the bias of an algorithm can be analyzed through the convergence rate of its margin.

Findings

01 The bias of the hyperplane iterates converges at least as fast as the square root of the margin's convergence rate.

02 The result gives a generic framework for analyzing implicit bias via the margin, without an algorithm-specific analysis.

03 The distance to the optimal hyperplane is bounded in terms of the difference between the optimal margin and the margin attained.

Abstract

Consider $n$ points $x_{1}, \dots, x_{n}$ in finite-dimensional Euclidean space, each having one of two colors. Suppose there exists a separating hyperplane (identified with its unit normal vector $w$) for the points, i.e., a hyperplane such that points of the same color lie on the same side of the hyperplane. We measure the quality of such a hyperplane by its margin $\gamma(w)$, defined as the minimum distance between any of the points $x_{i}$ and the hyperplane. In this paper, we prove that the margin function $\gamma$ satisfies a nonsmooth Kurdyka-Lojasiewicz inequality with exponent $1/2$. This result has far-reaching consequences. For example, let $\gamma^{\mathrm{opt}}$ be the maximum possible margin for the problem and let $w^{\mathrm{opt}}$ be the parameter for the hyperplane which attains this value. Given any other separating hyperplane with parameter $w$, let $d(w) := \| w - w^{\mathrm{opt}} \|$ be the Euclidean distance…
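
To make the quantities in the abstract concrete, here is a minimal Python sketch that computes the margin of a hyperplane, the margin gap, and the distance to an approximate maximum-margin direction on a toy 2D data set. It is only an illustration, not the paper's method, and it rests on several assumptions that are not in the abstract: the two colors are encoded as labels `+1` and `-1`, the hyperplane passes through the origin (so it is fully described by its unit normal), and the optimal direction is approximated by a brute-force grid over angles. The names `margin`, `approx_max_margin_2d`, `X`, and `y` are hypothetical.

```python
import numpy as np


def margin(w, X, y):
    """Smallest signed distance from the points to the hyperplane {x : <w, x> = 0}.

    Assumption: labels are +1 / -1 and the hyperplane passes through the origin.
    The value is positive exactly when w separates the two classes.
    """
    w = np.asarray(w, dtype=float)
    w = w / np.linalg.norm(w)          # work with the unit normal
    return float(np.min(y * (X @ w)))


def approx_max_margin_2d(X, y, num_angles=3600):
    """Brute-force approximation of the max-margin unit normal in 2D.

    This grid search is only for illustration; it is not taken from the paper.
    """
    angles = np.linspace(0.0, 2.0 * np.pi, num_angles, endpoint=False)
    candidates = np.column_stack([np.cos(angles), np.sin(angles)])  # unit vectors
    margins = np.min(y[None, :] * (candidates @ X.T), axis=1)       # one margin per candidate
    best = int(np.argmax(margins))
    return candidates[best], float(margins[best])


# Toy data (hypothetical): two points of each color.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-3.0, -1.0]])
y = np.array([1, 1, -1, -1])

w_opt, gamma_opt = approx_max_margin_2d(X, y)

# Compare a few separating directions against the approximate optimum.
for w in ([1.0, 0.0], [0.0, 1.0], [2.0, 1.0]):
    w_unit = np.asarray(w) / np.linalg.norm(w)
    gap = gamma_opt - margin(w_unit, X, y)     # margin suboptimality
    dist = np.linalg.norm(w_unit - w_opt)      # d(w) = ||w - w_opt||
    print(f"w={w_unit}, margin gap={gap:.4f}, d(w)={dist:.4f}")
```

Printing the margin gap next to `d(w)` for several directions gives a quick feel for how the two quantities shrink together as `w` approaches the maximum-margin direction. The grid search only works in two dimensions; in higher dimensions one would instead solve the hard-margin SVM problem with a convex solver.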

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Advanced Optimization Algorithms Research · Optimization and Variational Analysis