Bilevel Optimization: Convergence Analysis and Enhanced Design
Kaiyi Ji, Junjie Yang, Yingbin Liang

TL;DR
This paper provides a comprehensive convergence analysis of bilevel optimization algorithms, introduces a new stochastic method with improved efficiency, and validates these findings through experiments in meta-learning and hyperparameter tuning.
Contribution
It offers the first theoretical convergence rates for iterative differentiation, improves existing rates for implicit differentiation, and proposes a novel stochastic algorithm with superior complexity guarantees.
Findings
The convergence rate for the AID-based method is improved order-wise through a more practical parameter selection and a warm-start strategy.
First theoretical convergence rate established for ITD-based methods.
stocBiO outperforms existing algorithms in efficiency for hyperparameter optimization.
Abstract
Bilevel optimization has arisen as a powerful tool for many machine learning problems such as meta-learning, hyperparameter optimization, and reinforcement learning. In this paper, we investigate the nonconvex-strongly-convex bilevel optimization problem. For deterministic bilevel optimization, we provide a comprehensive convergence rate analysis for two popular algorithms respectively based on approximate implicit differentiation (AID) and iterative differentiation (ITD). For the AID-based method, we orderwisely improve the previous convergence rate analysis due to a more practical parameter selection as well as a warm start strategy, and for the ITD-based method we establish the first theoretical convergence rate. Our analysis also provides a quantitative comparison between ITD and AID based approaches. For stochastic bilevel optimization, we propose a novel algorithm named stocBiO,…
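To make the AID and ITD hypergradient estimates mentioned in the abstract concrete, here is a minimal NumPy sketch on a toy quadratic bilevel problem. Everything in it is an illustrative assumption rather than the paper's setup or code: the inner objective g(x, y) = 0.5 y^T A y - y^T B x (strongly convex in y), the outer objective f(x, y) = 0.5 ||y - c||^2, and all names (A, B, c, aid_hypergradient, itd_hypergradient, run_aid). It is deterministic and is not stocBiO. On this problem the exact hypergradient is B^T A^{-1} (y*(x) - c) with y*(x) = A^{-1} B x, so both estimates can be checked against it.

```python
import numpy as np

# Hypothetical toy problem (an assumption, not taken from the paper):
#   inner: g(x, y) = 0.5 * y^T A y - y^T B x   (strongly convex in y)
#   outer: f(x, y) = 0.5 * ||y - c||^2         (no direct dependence on x)
rng = np.random.default_rng(0)
p, q = 3, 4                                   # sizes of outer x and inner y
M = rng.standard_normal((q, q))
A = M @ M.T + q * np.eye(q)                   # symmetric positive definite Hessian of g in y
B = rng.standard_normal((q, p))
c = rng.standard_normal(q)
INNER_LR = 1.0 / np.linalg.eigvalsh(A)[-1]    # 1 / largest eigenvalue of A

def grad_y_g(x, y):
    return A @ y - B @ x

def grad_y_f(y):
    return y - c

def inner_descent(x, y, steps):
    """Run `steps` gradient-descent steps on the inner problem, starting from y."""
    for _ in range(steps):
        y = y - INNER_LR * grad_y_g(x, y)
    return y

def conjugate_gradient(matvec, b, steps):
    """Approximately solve matvec(v) = b for a symmetric positive definite operator."""
    v = np.zeros_like(b)
    r = b.copy()
    d = r.copy()
    for _ in range(steps):
        rr = r @ r
        if rr < 1e-30:                        # residual is numerically zero
            break
        Ad = matvec(d)
        alpha = rr / (d @ Ad)
        v = v + alpha * d
        r = r - alpha * Ad
        d = r + ((r @ r) / rr) * d
    return v

def aid_hypergradient(x, y0, inner_steps, cg_steps):
    """AID-style estimate: solve the linear system (Hessian of g in y) v = grad_y f."""
    y = inner_descent(x, y0, inner_steps)
    v = conjugate_gradient(lambda u: A @ u, grad_y_f(y), cg_steps)
    # Here grad_x f = 0 and the cross derivative of g is -B^T, so the estimate is B^T v.
    return B.T @ v, y

def itd_hypergradient(x, y0, inner_steps):
    """ITD-style estimate: differentiate through the inner iterates.

    J tracks d y_t / d x along the inner trajectory (y0 is treated as a constant).
    """
    y = y0.copy()
    J = np.zeros((q, p))
    for _ in range(inner_steps):
        J = J - INNER_LR * (A @ J - B)
        y = y - INNER_LR * grad_y_g(x, y)
    return J.T @ grad_y_f(y), y

def exact_hypergradient(x):
    y_star = np.linalg.solve(A, B @ x)
    return B.T @ np.linalg.solve(A, y_star - c)

def phi(x):
    """Outer objective evaluated at the exact inner solution y*(x)."""
    y_star = np.linalg.solve(A, B @ x)
    return 0.5 * np.sum((y_star - c) ** 2)

def run_aid(x0, outer_steps=50, inner_steps=20, cg_steps=10):
    """Outer gradient descent; each inner solve is warm-started from the previous y."""
    outer_lr = 1.0 / np.linalg.norm(np.linalg.solve(A, B), 2) ** 2
    x, y = x0.copy(), np.zeros(q)
    for _ in range(outer_steps):
        h, y = aid_hypergradient(x, y, inner_steps, cg_steps)
        x = x - outer_lr * h
    return x
```

A possible usage is to call `aid_hypergradient(np.ones(p), np.zeros(q), 200, 10)` and `itd_hypergradient(np.ones(p), np.zeros(q), 200)` and compare both results with `exact_hypergradient(np.ones(p))`. Reusing the previous `y` in `run_aid` is one simple way to realize a warm start; the ITD variant here propagates the Jacobian forward for brevity, whereas practical implementations typically backpropagate through the unrolled inner loop.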
Taxonomy
Topics: Stochastic processes and financial applications
