Gradient Equilibrium in Online Learning: Theory and Applications
Anastasios N. Angelopoulos, Michael I. Jordan, Ryan J. Tibshirani

TL;DR
This paper introduces the concept of gradient equilibrium in online learning, demonstrating its attainability with standard methods and its practical applications in prediction calibration, debiasing, and scoring under distribution shifts.
Contribution
The paper defines gradient equilibrium in online learning, shows that standard methods such as gradient descent and mirror descent with constant step sizes achieve it, and applies it to debiasing, calibration, and scoring of predictions under distribution shift.
Findings
Gradient equilibrium can be achieved with gradient descent and mirror descent.
Gradient equilibrium enables debiasing under distribution shift.
Post hoc gradient updates improve calibration and scoring accuracy.
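A short worked identity may help explain why plain gradient descent with a constant step size can reach gradient equilibrium. The notation here is an assumption for illustration and is not taken from this page: \theta_t is the iterate, \ell_t the loss at round t, g_t its gradient at \theta_t, and \eta > 0 the step size. The sketch covers only unconstrained gradient descent, not mirror descent.

```latex
\theta_{t+1} = \theta_t - \eta\, g_t, \qquad g_t = \nabla \ell_t(\theta_t)
\quad\Longrightarrow\quad
\frac{1}{T}\sum_{t=1}^{T} g_t \;=\; \frac{\theta_1 - \theta_{T+1}}{\eta\, T},
\qquad\text{so}\qquad
\left\| \frac{1}{T}\sum_{t=1}^{T} g_t \right\| \;\le\; \frac{\|\theta_1\| + \|\theta_{T+1}\|}{\eta\, T}.
```

Under this identity, the average gradient tends to zero whenever the iterates stay bounded, or more generally whenever \|\theta_{T+1}\| grows more slowly than T.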
Abstract
We present a new perspective on online learning that we refer to as gradient equilibrium: a sequence of iterates achieves gradient equilibrium if the average of gradients of losses along the sequence converges to zero. In general, this condition is not implied by, nor implies, sublinear regret. It turns out that gradient equilibrium is achievable by standard online learning methods such as gradient descent and mirror descent with constant step sizes (rather than decaying step sizes, as is usually required for no regret). Further, as we show through examples, gradient equilibrium translates into an interpretable and meaningful property in online prediction problems spanning regression, classification, quantile estimation, and others. Notably, we show that the gradient equilibrium framework can be used to develop a debiasing scheme for black-box predictions under arbitrary distribution…
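As a minimal sketch of the debiasing idea, assuming squared loss and a single scalar offset added to a black-box prediction, the following simulation runs online gradient descent with a constant step size. The function name debias_online, the simulated data, and the step size 0.1 are all hypothetical choices made for illustration; this is not the authors' implementation. With squared loss, the gradient at round t is the signed error of the corrected prediction, so gradient equilibrium here means the long-run average error of the corrected predictions goes to zero.

```python
import random


def debias_online(preds, targets, eta=0.1):
    """Correct black-box predictions with a running offset theta.

    Assumed loss (illustrative): l_t(theta) = 0.5 * (f_t + theta - y_t)^2,
    whose gradient at theta_t is (f_t + theta_t - y_t).
    """
    theta = 0.0
    corrected, grads = [], []
    for f, y in zip(preds, targets):
        p = f + theta        # corrected prediction, issued before y is used
        g = p - y            # gradient of the squared loss at the current theta
        theta -= eta * g     # gradient step with a constant step size
        corrected.append(p)
        grads.append(g)
    return corrected, grads, theta


random.seed(0)
T = 5000
targets = [random.gauss(0.0, 1.0) for _ in range(T)]
# Hypothetical black-box predictor whose bias shifts from +2 to -1 halfway through.
preds = [
    y + (2.0 if t < T // 2 else -1.0) + random.gauss(0.0, 0.5)
    for t, y in enumerate(targets)
]

corrected, grads, theta_final = debias_online(preds, targets, eta=0.1)

raw_bias = sum(f - y for f, y in zip(preds, targets)) / T   # average error before correction
avg_grad = sum(grads) / T                                   # average error after correction

print("average error of raw predictions:      ", raw_bias)
print("average error of corrected predictions:", avg_grad)
```

Because each update subtracts eta times the gradient, the average gradient equals -theta_final / (eta * T) when theta starts at zero, so it shrinks at rate 1/T as long as the offset stays bounded, even though the bias of the underlying predictor changes partway through.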
Taxonomy
Topics: Game Theory and Applications
Methods: High-Order Consensuses
