Bilevel Optimization: Convergence Analysis and Enhanced Design

Kaiyi Ji; Junjie Yang; Yingbin Liang

arXiv:2010.07962 · cs.LG · August 30, 2021 · 32 cites

TL;DR

This paper analyzes the convergence of deterministic bilevel optimization algorithms based on approximate implicit differentiation (AID) and iterative differentiation (ITD), proposes a new stochastic algorithm named stocBiO with improved efficiency, and validates the findings with experiments in meta-learning and hyperparameter tuning.

Contribution

The paper establishes the first theoretical convergence rate for the ITD-based method, improves the existing convergence rate for the AID-based method in order, and proposes a novel stochastic algorithm, stocBiO, with improved complexity guarantees.

Findings

01. The AID-based method's convergence rate improves in order, thanks to a more practical parameter selection and a warm start strategy.

02. The ITD-based method receives its first theoretical convergence rate.

03. stocBiO outperforms existing algorithms in efficiency for hyperparameter optimization.

Abstract

Bilevel optimization has arisen as a powerful tool for many machine learning problems such as meta-learning, hyperparameter optimization, and reinforcement learning. In this paper, we investigate the nonconvex-strongly-convex bilevel optimization problem. For deterministic bilevel optimization, we provide a comprehensive convergence rate analysis for two popular algorithms, based respectively on approximate implicit differentiation (AID) and iterative differentiation (ITD). For the AID-based method, we orderwisely improve the previous convergence rate analysis due to a more practical parameter selection as well as a warm start strategy, and for the ITD-based method we establish the first theoretical convergence rate. Our analysis also provides a quantitative comparison between ITD- and AID-based approaches. For stochastic bilevel optimization, we propose a novel algorithm named stocBiO,…
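To make the distinction between the two deterministic approaches concrete, the following is a minimal sketch on a toy quadratic bilevel problem. It is not the authors' implementation. Everything here is an assumption chosen for illustration: the inner objective g(x, y) = 0.5 yᵀAy − yᵀBx (strongly convex in y), the outer objective f(x, y) = 0.5‖y − c‖² + 0.5·lam·‖x‖², the use of conjugate gradient as the AID linear solver, and all function names (`make_problem`, `aid_hypergrad`, `itd_hypergrad`, and so on). For this problem the exact hypergradient has a closed form, so both approximations can be checked against it.

```python
import numpy as np


def make_problem(n=3, m=4, seed=0):
    """Build a hypothetical toy problem: A is SPD so g is strongly convex in y."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((m, m))
    A = M @ M.T + m * np.eye(m)
    B = rng.standard_normal((m, n))
    c = rng.standard_normal(m)
    return A, B, c


def exact_hypergrad(x, A, B, c, lam):
    """Closed form: y*(x) = A^{-1} B x, grad Phi(x) = lam*x + B^T A^{-1} (y* - c)."""
    y_star = np.linalg.solve(A, B @ x)
    return lam * x + B.T @ np.linalg.solve(A, y_star - c)


def inner_gd(x, y0, A, B, alpha, D):
    """D gradient-descent steps on g(x, .), where grad_y g = A y - B x."""
    y = y0.copy()
    for _ in range(D):
        y = y - alpha * (A @ y - B @ x)
    return y


def conjugate_gradient(A, b, N):
    """At most N conjugate-gradient steps for A v = b, starting from v = 0."""
    v = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(N):
        if rs < 1e-30:  # residual is already negligible
            break
        Ap = A @ p
        step = rs / (p @ Ap)
        v = v + step * p
        r = r - step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return v


def aid_hypergrad(x, y0, A, B, c, lam, alpha, D, N):
    """AID: approximate y*, then solve the linear system (grad_y^2 g) v = grad_y f."""
    y = inner_gd(x, y0, A, B, alpha, D)
    v = conjugate_gradient(A, y - c, N)
    # The Jacobian of grad_y g with respect to x is -B, hence the plus sign.
    return lam * x + B.T @ v, y


def itd_hypergrad(x, y0, A, B, c, lam, alpha, D):
    """ITD: differentiate through the D inner steps (backprop derived by hand)."""
    y = inner_gd(x, y0, A, B, alpha, D)
    u = y - c                      # grad_y f at the last inner iterate
    acc = np.zeros_like(u)
    for _ in range(D):             # accumulates sum_k (I - alpha*A)^k (y - c)
        acc = acc + u
        u = u - alpha * (A @ u)
    return lam * x + alpha * (B.T @ acc), y


if __name__ == "__main__":
    A, B, c = make_problem()
    lam = 0.1
    x = np.ones(B.shape[1])
    y0 = np.zeros(A.shape[0])
    alpha = 1.0 / np.linalg.eigvalsh(A)[-1]   # step size 1/L for the inner problem
    exact = exact_hypergrad(x, A, B, c, lam)
    for D in (5, 20, 80):
        g_aid, _ = aid_hypergrad(x, y0, A, B, c, lam, alpha, D, N=A.shape[0])
        g_itd, _ = itd_hypergrad(x, y0, A, B, c, lam, alpha, D)
        print(D, np.linalg.norm(g_aid - exact), np.linalg.norm(g_itd - exact))
```

Both functions also return the last inner iterate `y`. Passing it back as `y0` at the next outer step is one simple way to realize a warm start, again as an illustration rather than the paper's exact procedure.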

Code & Models

Videos

Bilevel Optimization: Convergence Analysis and Enhanced Design · SlidesLive

Taxonomy

Topics: Stochastic processes and financial applications