Bilevel Optimization under Unbounded Smoothness: A New Algorithm and Convergence Analysis
Jie Hao, Xiaochuan Gong, Mingrui Liu

TL;DR
This paper introduces BO-REP, a bilevel optimization algorithm for problems whose upper-level objective has potentially unbounded smoothness, as arises with neural networks such as RNNs and LSTMs. The algorithm comes with a convergence guarantee and is evaluated on machine learning tasks.
Contribution
The paper proposes BO-REP, a bilevel optimization algorithm that handles unbounded smoothness. It updates the upper-level variable with normalized momentum and the lower-level variable with two new techniques, initialization refinement and periodic updates.
Findings
BO-REP achieves $\tilde{O}(1/\epsilon^4)$ iteration complexity.
The algorithm does not rely on the upper-level gradient being Lipschitz, so it applies to neural networks with potentially unbounded smoothness such as RNNs and LSTMs.
Experimental results show improved performance in hyper-representation learning and hyperparameter optimization.
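For readers new to the terminology, the quantities behind these findings can be written out as follows. This is a sketch in assumed notation: the symbols $x$, $y$, $f$, $g$, $\Phi$, $F$, $L_0$, $L_1$ are not taken from this page, and the relaxed smoothness condition shown is the common single-level form, which may differ in detail from the condition the paper imposes on the upper-level function.

$$\min_{x} \; \Phi(x) := f\bigl(x, y^*(x)\bigr), \qquad y^*(x) \in \arg\min_{y} \; g(x, y)$$

$$\|\nabla^2 F(w)\| \le L_0 + L_1 \|\nabla F(w)\| \quad \text{(relaxed smoothness: curvature may grow with the gradient norm)}$$

$$\|\nabla \Phi(x)\| \le \epsilon \quad \text{(the usual } \epsilon\text{-stationarity target that an iteration complexity counts toward)}$$

When $L_1 = 0$ the second condition reduces to the standard Lipschitz-gradient assumption with constant $L_0$, which is the case conventional bilevel analyses cover.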
Abstract
Bilevel optimization is an important formulation for many machine learning problems. Current bilevel optimization algorithms assume that the gradient of the upper-level function is Lipschitz. However, recent studies reveal that certain neural networks, such as recurrent neural networks (RNNs) and long short-term memory networks (LSTMs), exhibit potentially unbounded smoothness, rendering conventional bilevel optimization algorithms unsuitable. In this paper, we design a new bilevel optimization algorithm, namely BO-REP, to address this challenge. This algorithm updates the upper-level variable using normalized momentum and incorporates two novel techniques for updating the lower-level variable: \textit{initialization refinement} and \textit{periodic updates}. Specifically, once the upper-level variable is initialized, a subroutine is invoked to obtain a refined estimate of the corresponding…
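The three ingredients named in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation: the quadratic problem, the deterministic gradients, and every name and hyperparameter value (`A`, `target`, `run_sketch`, `period`, `refine_steps`, and so on) are assumptions chosen for illustration. It also assumes that "periodic updates" means the lower-level variable is refreshed only every few upper-level iterations.

```python
import numpy as np

# Hypothetical toy problem (not from the paper):
#   lower level: g(x, y) = 0.5 * ||y - A x||^2   so  y*(x) = A x
#   upper level: f(x, y) = 0.5 * ||y - target||^2
A = np.array([[2.0, 0.0], [0.0, 1.0]])
target = np.array([1.0, -1.0])

def upper_objective(x):
    """Phi(x) = f(x, y*(x)), evaluated with the exact lower-level solution."""
    return 0.5 * np.sum((A @ x - target) ** 2)

def lower_level_steps(x, y, num_steps, step_size=0.5):
    """Plain gradient descent on g(x, .), warm-started from the current y."""
    for _ in range(num_steps):
        y = y - step_size * (y - A @ x)
    return y

def hypergradient(x, y):
    """Hypergradient estimate; for this toy g it reduces to A^T (y - target).

    The argument x is unused here only because f has no direct x-dependence.
    """
    return A.T @ (y - target)

def run_sketch(num_iters=500, lr=0.01, beta=0.9, period=5,
               refine_steps=50, update_steps=10):
    x = np.zeros(2)
    # Initialization refinement: solve the lower level accurately once at x_0.
    y = lower_level_steps(x, np.zeros(2), refine_steps)
    m = hypergradient(x, y)
    for t in range(num_iters):
        # Periodic update: refresh y only every `period` upper-level iterations.
        if t > 0 and t % period == 0:
            y = lower_level_steps(x, y, update_steps)
        m = beta * m + (1.0 - beta) * hypergradient(x, y)
        # Normalized momentum: the step length is lr regardless of ||m||.
        x = x - lr * m / (np.linalg.norm(m) + 1e-12)
    return x

phi_init = upper_objective(np.zeros(2))
phi_final = upper_objective(run_sketch())
print(phi_init, phi_final)
```

Because the step is normalized, its length does not grow when the gradient is large, which is the property that makes this kind of update attractive when smoothness is unbounded. The same normalization means the iterate moves at most `lr * period` between lower-level refreshes, so a stale `y` stays close to the current lower-level solution.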
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Machine Learning and ELM
