A Fully First-Order Method for Stochastic Bilevel Optimization
Jeongyeol Kwon, Dohyun Kwon, Stephen Wright, Robert Nowak

TL;DR
This paper introduces a fully first-order stochastic method for bilevel optimization that achieves rigorous convergence guarantees and outperforms second-order methods in practical experiments.
Contribution
The paper proposes the F2SA algorithm, a first-order method with non-asymptotic convergence guarantees for stochastic bilevel problems, avoiding expensive Hessian computations.
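To make "fully first-order" concrete, the sketch below runs a penalty-style scheme on a toy problem. It assumes the bilevel problem $\min_x f(x, y^*(x))$ with $y^*(x) = \arg\min_y g(x, y)$ is replaced by the penalized objective $f(x, y) + \lambda\,(g(x, y) - \min_z g(x, z))$, whose gradient in $x$ needs only $\nabla f$ and $\nabla g$. This is an illustrative, deterministic sketch and not the authors' algorithm: the quadratic objectives, the function names, the step sizes, and the $\lambda$ schedule are all assumptions chosen for this example.

```python
def grad_f(x, y):
    """Gradients of the toy upper-level objective f(x, y) = 0.5*(y - 1)**2 + 0.5*x**2."""
    return x, y - 1.0  # (df/dx, df/dy)


def grad_g(x, y):
    """Gradients of the toy lower-level objective g(x, y) = 0.5*(y - x)**2."""
    return x - y, y - x  # (dg/dx, dg/dy)


def penalty_first_order(x0=0.0, lam0=1.0, lam_max=100.0, lam_step=0.1,
                        outer_iters=3000, inner_iters=5, x_lr=0.05, z_lr=0.5):
    """Hypothetical penalty-based loop that uses gradients only (no Hessians)."""
    x, y, z, lam = x0, x0, x0, lam0
    for _ in range(outer_iters):
        # Step size for y: f + lam*g has curvature 1 + lam in y for this toy problem.
        y_lr = 1.0 / (1.0 + lam)
        for _ in range(inner_iters):
            # y tracks the minimizer of f(x, .) + lam * g(x, .)
            _, fy = grad_f(x, y)
            _, gy = grad_g(x, y)
            y -= y_lr * (fy + lam * gy)
            # z tracks the minimizer of g(x, .)
            _, gz = grad_g(x, z)
            z -= z_lr * gz
        # x-gradient of the penalized objective: first-order terms only.
        fx, _ = grad_f(x, y)
        gx_at_y, _ = grad_g(x, y)
        gx_at_z, _ = grad_g(x, z)
        x -= x_lr * (fx + lam * (gx_at_y - gx_at_z))
        # Grow the penalty weight gradually, capped at lam_max (illustrative schedule).
        lam = min(lam + lam_step, lam_max)
    return x, y, z, lam


x_final, y_final, z_final, lam_final = penalty_first_order()
print(x_final, y_final, z_final, lam_final)
```

For this toy problem $y^*(x) = x$, so the true bilevel solution is $x = 1/2$, while the penalized problem with a fixed $\lambda$ has its stationary point at $x = \lambda/(1 + 2\lambda)$. The gap between the two shrinks as $\lambda$ grows, which is why the sketch increases $\lambda$ over the iterations.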
Findings
F2SA converges in $\tilde{O}\left(\frac{1}{\epsilon^{7/2}}\right)$ iterations with stochastic noise in both levels.
Momentum-enhanced estimators improve convergence to $\tilde{O}\left(\frac{1}{\epsilon^{5/2}}\right)$ iterations.
F2SA outperforms second-order methods on MNIST hypercleaning tasks.
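As a rough worked example of what the two rates above mean, take a target accuracy of $\epsilon = 10^{-2}$ and ignore the constants and logarithmic factors hidden by $\tilde{O}$ (an assumption made only for illustration):

$$
\epsilon^{-7/2} = \left(10^{-2}\right)^{-7/2} = 10^{7},
\qquad
\epsilon^{-5/2} = \left(10^{-2}\right)^{-5/2} = 10^{5},
\qquad
\frac{\epsilon^{-7/2}}{\epsilon^{-5/2}} = \epsilon^{-1} = 10^{2}.
$$

Under these simplifications, the momentum-based variant saves a factor of $1/\epsilon$ in iterations, and the saving grows as the target accuracy tightens.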
Abstract
We consider stochastic unconstrained bilevel optimization problems when only the first-order gradient oracles are available. While numerous optimization methods have been proposed for tackling bilevel problems, existing methods either tend to require possibly expensive calculations regarding Hessians of lower-level objectives, or lack rigorous finite-time performance guarantees. In this work, we propose a Fully First-order Stochastic Approximation (F2SA) method, and study its non-asymptotic convergence properties. Specifically, we show that F2SA converges to an $\epsilon$-stationary solution of the bilevel problem after $\epsilon^{-7/2}$, $\epsilon^{-5/2}$, and $\epsilon^{-3/2}$ iterations (each iteration using $O(1)$ samples) when stochastic noises are in both level objectives, only in the upper-level objective, and not present (deterministic settings), respectively. We further show that…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Statistical Methods and Inference · Risk and Portfolio Optimization
