General Loss Functions Lead to (Approximate) Interpolation in High Dimensions
Kuo-Wei Lai, Vidya Muthukumar

TL;DR
This paper introduces a unified framework to analyze the implicit bias of gradient descent with general convex losses in high-dimensional overparameterized models, revealing approximate interpolation behavior.
Contribution
It extends previous analyses by providing a general primal-dual framework for convex losses, enabling new approximate equivalences and insights into implicit bias in high dimensions.
Findings
The implicit bias of gradient descent approximates minimum-norm interpolation in high dimensions.
The framework recovers existing exact equivalence results for exponentially-tailed losses.
The analysis shows how specialized loss functions affect the resulting solutions.
Abstract
We provide a unified framework that applies to a general family of convex losses across binary and multiclass settings in the overparameterized regime to approximately characterize the implicit bias of gradient descent in closed form. Specifically, we show that the implicit bias is approximated by (but not exactly equal to) the minimum-norm interpolation in high dimensions, which arises from training on the squared loss. In contrast to prior work, which was tailored to exponentially-tailed losses and used the intermediate support-vector-machine formulation, our framework directly builds on the primal-dual analysis of Ji and Telgarsky (2021), allowing us to provide new approximate equivalences for general convex losses through a novel sensitivity analysis. Our framework also recovers existing exact equivalence results for exponentially-tailed losses across binary and multiclass settings.…
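The reference point named in the abstract, minimum-norm interpolation arising from training on the squared loss, can be illustrated with a minimal NumPy sketch. This is not the paper's framework or code: the Gaussian data, the dimensions, the step size, and the function names `min_norm_interpolator` and `gd_squared_loss` are all assumptions chosen for illustration. It only checks the standard fact that gradient descent on the squared loss, started from zero, converges to the minimum-norm interpolator when the number of features far exceeds the number of samples. It says nothing about the general convex losses the paper analyzes.

```python
import numpy as np


def min_norm_interpolator(X, y):
    """Minimum-norm solution of X w = y for n < d: w = X^T (X X^T)^{-1} y."""
    return X.T @ np.linalg.solve(X @ X.T, y)


def gd_squared_loss(X, y, steps=500):
    """Gradient descent on 0.5 * ||X w - y||^2, initialized at w = 0."""
    n, d = X.shape
    # Step size 1 / sigma_max(X)^2 (an illustrative choice that guarantees descent).
    lr = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(d)
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y)
    return w


# Hypothetical overparameterized setup: 20 samples, 2000 features, binary labels.
rng = np.random.default_rng(0)
n, d = 20, 2000
X = rng.standard_normal((n, d))
y = rng.choice([-1.0, 1.0], size=n)

w_mni = min_norm_interpolator(X, y)
w_gd = gd_squared_loss(X, y)

# The closed-form solution fits every label exactly, and gradient descent
# from zero initialization lands on the same vector.
print("interpolates:", np.allclose(X @ w_mni, y))
print("GD matches MNI:", np.allclose(w_gd, w_mni, atol=1e-8))
```

Starting from zero keeps every iterate in the row space of `X`, which is why gradient descent selects the minimum-norm solution among all interpolating ones.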
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms
