Efficient Gradient Approximation Method for Constrained Bilevel Optimization
Siyuan Xu, Minghui Zhu

TL;DR
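This paper introduces a gradient approximation method for constrained bilevel optimization in which the lower-level problem is convex with equality and inequality constraints and the upper-level problem is non-convex, so that the overall objective is non-convex and non-differentiable. The method is shown to converge asymptotically to the set of Clarke stationary points and is evaluated on hyperparameter optimization and meta-learning.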
Contribution
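The paper proposes a gradient-based approach that determines a descent direction from several representative gradients of the objective function, computed inside a neighborhood of the current estimate, and proves asymptotic convergence to the set of Clarke stationary points.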
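For orientation, the problem class and the stationarity notion can be written out as follows. This is only a sketch: the symbols f, g, p, q, y^* and \Phi are notation assumed here rather than taken from the paper, and it further assumes that the lower-level solution y^*(x) is unique and that \Phi is locally Lipschitz.

```latex
% Constrained bilevel problem (assumed notation):
% f = upper-level objective (non-convex), g = lower-level objective (convex in y),
% p, q = lower-level inequality and equality constraints.
\min_{x} \; \Phi(x) := f\bigl(x, y^*(x)\bigr)
\quad \text{where} \quad
y^*(x) \in \arg\min_{y} \bigl\{\, g(x, y) \;:\; p(x, y) \le 0,\; q(x, y) = 0 \,\bigr\}.

% Clarke subdifferential of a locally Lipschitz \Phi:
\partial \Phi(x) = \operatorname{conv}\Bigl\{ \lim_{k \to \infty} \nabla \Phi(x_k) \;:\;
  x_k \to x,\ \Phi \text{ differentiable at } x_k \Bigr\}.

% x^* is a Clarke stationary point when
0 \in \partial \Phi(x^*).
```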
Findings
The algorithm converges asymptotically to the set of Clarke stationary points.
Experiments on hyperparameter optimization demonstrate its efficacy.
Experiments on meta-learning demonstrate its efficacy.
Abstract
Bilevel optimization has been developed for many machine learning tasks with large-scale and high-dimensional data. This paper considers a constrained bilevel optimization problem, where the lower-level optimization problem is convex with equality and inequality constraints and the upper-level optimization problem is non-convex. The overall objective function is non-convex and non-differentiable. To solve the problem, we develop a gradient-based approach, called gradient approximation method, which determines the descent direction by computing several representative gradients of the objective function inside a neighborhood of the current estimate. We show that the algorithm asymptotically converges to the set of Clarke stationary points, and demonstrate the efficacy of the algorithm by the experiments on hyperparameter optimization and meta-learning.
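The idea of choosing a descent direction from several gradients taken in a neighborhood of the current estimate can be illustrated with a generic gradient-sampling style sketch. This is not the authors' algorithm: the abstract does not say how the representative gradients are selected or how the bilevel gradient is computed. The function names, the toy objective phi, the Frank-Wolfe min-norm solver, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def min_norm_in_hull(G, iters=200):
    """Approximate the minimum-norm point of conv{rows of G} with Frank-Wolfe."""
    m = G.shape[0]
    lam = np.full(m, 1.0 / m)          # convex weights over the sampled gradients
    for _ in range(iters):
        v = lam @ G                    # current point in the convex hull
        j = int(np.argmin(G @ v))      # vertex minimizing the linearized objective
        d = G[j] - v
        denom = d @ d
        if denom < 1e-16:
            break
        # Exact line search for ||v + t d||^2, restricted to t in [0, 1].
        step = np.clip(-(v @ d) / denom, 0.0, 1.0)
        if step == 0:
            break
        lam = (1.0 - step) * lam
        lam[j] += step
    return lam @ G

def sampled_descent(f, grad, x0, radius=0.1, n_samples=8, steps=100,
                    step0=1.0, beta=0.5, tol=1e-3, seed=0):
    """Descend along the negated min-norm combination of nearby gradients."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        # Gradients at the current point and at random points in a box around it.
        pts = x + radius * rng.uniform(-1.0, 1.0, size=(n_samples, x.size))
        G = np.vstack([grad(x)] + [grad(p) for p in pts])
        g = min_norm_in_hull(G)
        if np.linalg.norm(g) <= tol:
            radius *= 0.5              # approximately stationary at this radius
            continue
        # Backtracking (Armijo) line search keeps the iteration monotone.
        t = step0
        while t > 1e-10 and f(x - t * g) > f(x) - beta * t * (g @ g):
            t *= 0.5
        if t > 1e-10:
            x = x - t * g
        else:
            radius *= 0.5              # no acceptable step: shrink the neighborhood
    return x

# Toy non-smooth objective (hypothetical, not from the paper): kink along x[0] = 0.
def phi(x):
    return abs(x[0]) + 0.5 * x[1] ** 2

def grad_phi(x):
    return np.array([np.sign(x[0]), x[1]])

x_final = sampled_descent(phi, grad_phi, np.array([1.0, -1.5]))
```

Near the kink, sampled gradients from both sides of x[0] = 0 enter the convex hull, so the min-norm combination shrinks toward zero instead of oscillating between the two one-sided gradients. That is the intuition behind using a neighborhood of gradients for a non-differentiable objective.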
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Statistical Methods and Inference · Stochastic Gradient Optimization Techniques
