UFO-BLO: Unbiased First-Order Bilevel Optimization
Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Davis, Adrian Weller

TL;DR
This paper introduces UFO-BLO, an unbiased first-order bilevel optimization method that guarantees convergence without increasing memory or time complexity, supported by theoretical analysis and experiments on meta-learning benchmarks.
Contribution
We propose a new unbiased FO-BLO gradient estimator that ensures convergence, addressing limitations of existing FO-BLO methods.
Findings
Unbiased FO-BLO guarantees convergence to stationary points.
Experimental results validate the method's effectiveness on few-shot learning benchmarks.
Existing FO-BLO methods can fail to converge to a stationary point of the BLO objective, a failure the unbiased estimator overcomes.
Abstract
Bilevel optimization (BLO) is a popular approach with many applications including hyperparameter optimization, neural architecture search, adversarial robustness and model-agnostic meta-learning. However, the approach suffers from time and memory complexity proportional to the length of its inner optimization loop, which has led to several modifications being proposed. One such modification is first-order BLO (FO-BLO), which approximates outer-level gradients by zeroing out second-derivative terms, yielding significant speed gains and requiring only constant memory as the length of the inner loop varies. Despite FO-BLO's popularity, there is a lack of theoretical understanding of its convergence properties. We make progress by demonstrating a rich family of examples where FO-BLO-based stochastic optimization does not converge to a stationary point of the BLO objective. We address this concern by…
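To make the first-order approximation concrete, here is a minimal sketch on a scalar toy problem. The toy is an assumption made for illustration and is not taken from the paper: the inner and outer losses, the names `inner_loop`, `exact_grad` and `first_order_grad`, and all numeric values are hypothetical. It compares the exact outer gradient, obtained by differentiating through every inner step, with the first-order one, obtained by dropping the second-derivative term.

```python
# Toy bilevel problem (hypothetical, chosen so that every derivative is analytic).
# Inner loss:  L_in(phi)  = 0.5 * a * (phi - c)**2   (its second derivative is a)
# Outer loss:  L_out(phi) = 0.5 * (phi - t)**2

def inner_loop(theta, a, c, alpha, r):
    """Run r gradient-descent steps on L_in, starting from phi_0 = theta."""
    phi = theta
    for _ in range(r):
        phi = phi - alpha * a * (phi - c)
    return phi

def outer_loss(theta, a, c, t, alpha, r):
    """Outer objective evaluated at the end of the inner loop."""
    phi_r = inner_loop(theta, a, c, alpha, r)
    return 0.5 * (phi_r - t) ** 2

def exact_grad(theta, a, c, t, alpha, r):
    """Exact outer gradient: each inner step contributes a factor (1 - alpha * a)."""
    phi_r = inner_loop(theta, a, c, alpha, r)
    return (phi_r - t) * (1.0 - alpha * a) ** r

def first_order_grad(theta, a, c, t, alpha, r):
    """First-order gradient: the second-derivative term a is zeroed out,
    so the Jacobian of every inner step is treated as 1."""
    phi_r = inner_loop(theta, a, c, alpha, r)
    return phi_r - t

# Illustrative values only.
theta, a, c, t, alpha, r = 1.0, 2.0, 0.5, -1.0, 0.1, 5

g_exact = exact_grad(theta, a, c, t, alpha, r)
g_fo = first_order_grad(theta, a, c, t, alpha, r)

# Central finite difference as an independent check of the exact gradient.
eps = 1e-6
g_fd = (outer_loss(theta + eps, a, c, t, alpha, r)
        - outer_loss(theta - eps, a, c, t, alpha, r)) / (2 * eps)

print("exact:", g_exact, "finite difference:", g_fd, "first-order:", g_fo)
```

In this toy, the first-order gradient differs from the exact one by the factor `(1 - alpha * a) ** r`. With a single task both vanish at the same point, but when gradients are averaged over tasks with different curvatures `a`, that factor varies per task, so the averaged first-order gradient can vanish at a different point than the averaged exact gradient. This is only meant to illustrate the kind of bias the abstract refers to; it is not the paper's construction or its estimator.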
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Risk and Portfolio Optimization · Stochastic processes and financial applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
