Implicit bias of any algorithm: bounding bias via margin
Elvis Dohmatob

TL;DR
This paper proves a Kurdyka-Lojasiewicz inequality for the margin function in binary classification. The inequality bounds the bias of a separating hyperplane (its distance to the maximum-margin hyperplane) in terms of its margin, which gives a tool for analyzing the implicit bias of optimization algorithms.
Contribution
It proves a Kurdyka-Lojasiewicz inequality for the margin function, so that the implicit bias of an algorithm can be analyzed through the convergence rate of its margin.
Findings
The bias of the hyperplane iterates converges at least as fast as the square root of the margin convergence rate.
Provides a generic framework for analyzing implicit bias via the margin, without algorithm-specific analysis.
Establishes a bound relating the distance to the optimal hyperplane to the margin difference.
Abstract
Consider $n$ points $x_1, \ldots, x_n$ in finite-dimensional euclidean space, each having one of two colors. Suppose there exists a separating hyperplane (identified with its unit normal vector $w$) for the points, i.e. a hyperplane such that points of the same color lie on the same side of the hyperplane. We measure the quality of such a hyperplane by its margin $\gamma(w)$, defined as the minimum distance between any of the points $x_i$ and the hyperplane. In this paper, we prove that the margin function $\gamma$ satisfies a nonsmooth Kurdyka-Lojasiewicz inequality with exponent $1/2$. This result has far-reaching consequences. For example, let $\gamma^{opt}$ be the maximum possible margin for the problem and let $w^{opt}$ be the parameter for the hyperplane which attains this value. Given any other separating hyperplane with parameter $w$, let $d(w)$ be the euclidean distance…
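To make the margin and the bias concrete, here is a minimal numerical sketch on a toy dataset. It makes several assumptions that the text above does not state: hyperplanes pass through the origin, colors are encoded as labels +1 and -1, and the two-sided bound uses illustrative constants (a lower bound from the margin being R-Lipschitz, where R is the largest point norm, and a square-root upper bound with constant 2). The function names `margin` and `bias_bounds` and the dataset are hypothetical.

```python
import numpy as np


def margin(w, X, y):
    """Margin of the hyperplane through the origin with normal w.

    Computed as min_i y_i * <w, x_i> after normalizing w to unit length.
    It is positive exactly when the hyperplane separates the two colors.
    """
    w = np.asarray(w, dtype=float)
    w = w / np.linalg.norm(w)
    return float(np.min(y * (X @ w)))


def bias_bounds(w, w_opt, X, y):
    """Return (lower, bias, upper) for a separating unit normal w.

    bias  = euclidean distance between w and the max-margin normal w_opt
    lower = margin gap / R              (assumed constant, R = max_i ||x_i||)
    upper = 2 * sqrt(gap / max margin)  (assumed constant 2)
    """
    w = np.asarray(w, dtype=float) / np.linalg.norm(w)
    w_opt = np.asarray(w_opt, dtype=float) / np.linalg.norm(w_opt)
    gamma_opt = margin(w_opt, X, y)
    # Clip tiny negative values caused by floating-point rounding.
    gap = max(gamma_opt - margin(w, X, y), 0.0)
    R = float(np.max(np.linalg.norm(X, axis=1)))
    bias = float(np.linalg.norm(w - w_opt))
    return gap / R, bias, 2.0 * np.sqrt(gap / gamma_opt)


# Toy data: four points in the plane, two per color.
X = np.array([[2.0, 1.0], [2.0, -1.0], [-2.0, 1.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# For this symmetric dataset the max-margin normal is (1, 0).
w_opt = np.array([1.0, 0.0])

# Another separating hyperplane, rotated by 0.3 radians.
w = np.array([np.cos(0.3), np.sin(0.3)])

lower, bias, upper = bias_bounds(w, w_opt, X, y)
print(f"margin(w_opt) = {margin(w_opt, X, y):.4f}")
print(f"margin(w)     = {margin(w, X, y):.4f}")
print(f"{lower:.4f} <= bias = {bias:.4f} <= {upper:.4f}")
```

In this sketch the bias shrinks like the square root of the margin gap, so any algorithm whose iterates have margin converging to the maximum margin would also have iterates converging to the maximum-margin hyperplane, which is the kind of argument the findings describe.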
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Advanced Optimization Algorithms Research · Optimization and Variational Analysis
