BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach
Mao Ye, Bo Liu, Stephen Wright, Peter Stone, Qiang Liu

TL;DR
This paper introduces a simple, efficient first-order bilevel optimization algorithm suitable for large-scale deep learning tasks, avoiding complex Hessian calculations and demonstrating strong empirical performance.
Contribution
The paper presents a first-order bilevel optimization method that is easy to implement, scales to large models, requires no implicit differentiation, and comes with a non-asymptotic convergence analysis.
Findings
Outperforms existing methods in large-scale deep learning tasks
Requires only first-order gradient information
Proven to converge to stationary points in non-convex settings
Abstract
Bilevel optimization (BO) is useful for solving a variety of important machine learning problems including but not limited to hyperparameter optimization, meta-learning, continual learning, and reinforcement learning. Conventional BO methods need to differentiate through the low-level optimization process with implicit differentiation, which requires expensive calculations related to the Hessian matrix. There has been a recent quest for first-order methods for BO, but the methods proposed to date tend to be complicated and impractical for large-scale deep learning applications. In this work, we propose a simple first-order BO algorithm that depends only on first-order gradient information, requires no implicit differentiation, and is practical and efficient for large-scale non-convex functions in deep learning. We provide non-asymptotic convergence analysis of the proposed method to…
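Illustrative sketch
The abstract does not spell out the update rule, so the following is only a minimal sketch of one way a first-order bilevel method can work, not the authors' implementation. It replaces the lower-level problem with the constraint q(v, theta) = g(v, theta) - g(v, theta_T) <= 0, where theta_T comes from a few plain gradient steps on g and is treated as a constant, so no Hessian or implicit differentiation is needed. The toy objectives, the multiplier rule for lam, and all names and step sizes (alpha, xi, eta, eps) are assumptions made for this example.

```python
def f_grads(v, theta, a=2.0):
    # Upper-level objective (assumed toy): f = 0.5*(theta - a)^2 + 0.5*v^2
    return v, theta - a  # (df/dv, df/dtheta)


def g_grads(v, theta):
    # Lower-level objective (assumed toy): g = 0.5*(theta - v)^2, so theta*(v) = v
    return v - theta, theta - v  # (dg/dv, dg/dtheta)


def first_order_bilevel(steps=5000, inner_steps=10, alpha=0.5,
                        xi=0.01, eta=0.5, eps=1e-12):
    v, theta = 0.0, 0.0
    for _ in range(steps):
        # 1) Approximate the lower-level solution with a few gradient steps.
        theta_T = theta
        for _ in range(inner_steps):
            theta_T -= alpha * g_grads(v, theta_T)[1]

        # 2) Gradient of q(v, theta) = g(v, theta) - g(v, theta_T),
        #    holding theta_T fixed (first-order information only).
        gv, gt = g_grads(v, theta)
        gv_T, _ = g_grads(v, theta_T)
        qv, qt = gv - gv_T, gt

        # 3) Scalar multiplier that forces the step to decrease q
        #    (assumed rule; eps guards against division by zero).
        fv, ft = f_grads(v, theta)
        sq = qv * qv + qt * qt
        lam = max((eta * sq - (fv * qv + ft * qt)) / (sq + eps), 0.0)

        # 4) Joint gradient step on the upper and lower variables.
        v -= xi * (fv + lam * qv)
        theta -= xi * (ft + lam * qt)
    return v, theta


if __name__ == "__main__":
    v, theta = first_order_bilevel()
    print(v, theta)
```

For this toy problem the bilevel solution is v = theta = a/2 = 1, which gives a simple check that the sketch behaves sensibly. In a deep learning setting the hand-written gradients would be replaced by automatic differentiation, which is why only first-order information is required.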
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques
