Inexact bilevel stochastic gradient methods for constrained and unconstrained lower-level problems

Tommaso Giovannelli; Griffin Dean Kent; Luis Nunes Vicente

arXiv:2110.00604 · math.OC · November 8, 2023 · 1 cites

TL;DR

This paper develops inexact bilevel stochastic gradient methods for bilevel problems whose lower level is either constrained or unconstrained, together with a convergence theory and practical algorithms aimed at large-scale machine learning tasks.

Contribution

The paper introduces a bilevel stochastic gradient method for problems with nonlinear and possibly nonconvex lower-level constraints, with convergence guarantees that account for inexact computation of the adjoint gradient (hypergradient).

Findings

01. Convergence theory covers both constrained and unconstrained lower-level problems.

02. New low-rank stochastic gradient methods avoid second-order derivatives.

03. Algorithms are suitable for large-scale machine learning applications.
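
This page does not say how the low-rank methods avoid second-order derivatives. As a generic illustration only, and not necessarily the paper's construction, a Hessian-vector product can be approximated from gradient evaluations alone using a central finite difference. The function `grad_g`, the test point, and the step `eps` below are hypothetical choices made for this sketch.

```python
def grad_g(y):
    # Gradient of a hypothetical quadratic g(y) = 0.5*y0^2 + 2*y1^2 + y0*y1,
    # whose exact Hessian is [[1, 1], [1, 4]].
    return [y[0] + y[1], y[0] + 4.0 * y[1]]


def fd_hessian_vector_product(grad, y, v, eps=1e-6):
    """Approximate H(y) @ v using only first-order information.

    Central difference: (grad(y + eps*v) - grad(y - eps*v)) / (2*eps).
    No second derivatives are formed or stored.
    """
    y_plus = [yi + eps * vi for yi, vi in zip(y, v)]
    y_minus = [yi - eps * vi for yi, vi in zip(y, v)]
    g_plus = grad(y_plus)
    g_minus = grad(y_minus)
    return [(gp - gm) / (2.0 * eps) for gp, gm in zip(g_plus, g_minus)]


if __name__ == "__main__":
    # Compare against the exact product [[1, 1], [1, 4]] @ [1, -1] = [0, -3].
    print(fd_hessian_vector_product(grad_g, [0.3, -0.7], [1.0, -1.0]))
```

Each product costs two gradient evaluations, which is the usual trade-off when curvature information is needed but explicit second derivatives are too expensive.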

Abstract

Two-level stochastic optimization formulations have become instrumental in a number of machine learning contexts such as continual learning, neural architecture search, adversarial learning, and hyperparameter tuning. Practical stochastic bilevel optimization problems become challenging in optimization or learning scenarios where the number of variables is high or there are constraints. In this paper, we introduce a bilevel stochastic gradient method for bilevel problems with nonlinear and possibly nonconvex lower-level constraints. We also present a comprehensive convergence theory that addresses both the lower-level unconstrained and constrained cases and covers all inexact calculations of the adjoint gradient (also called hypergradient), such as the inexact solution of the lower-level problem, inexact computation of the adjoint formula (due to the inexact solution of the adjoint…
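
To make the role of the adjoint gradient concrete, here is a minimal sketch on a toy problem with an unconstrained lower level. It is not the paper's algorithm. The scalar objectives, step sizes, and function names are all assumptions chosen for illustration: the upper level is f(x, y) = 0.5*x^2 + 0.5*(y - 1)^2, the lower level is g(x, y) = 0.5*(y - a*x)^2, and the adjoint gradient is evaluated at an inexact lower-level solution obtained from a few gradient steps.

```python
def lower_grad_y(x, y, a):
    # d/dy of the assumed lower-level objective g(x, y) = 0.5 * (y - a*x)**2
    return y - a * x


def solve_lower_inexact(x, y0, a, steps, lr):
    # A few gradient steps on g(x, .): returns an inexact lower-level solution.
    y = y0
    for _ in range(steps):
        y -= lr * lower_grad_y(x, y, a)
    return y


def adjoint_gradient(x, y, a):
    # Adjoint formula: grad_x f - g_xy * lam, where lam solves g_yy * lam = grad_y f.
    grad_x_f = x          # d/dx of f(x, y) = 0.5*x**2 + 0.5*(y - 1)**2
    grad_y_f = y - 1.0    # d/dy of f
    g_xy = -a             # mixed second derivative of g
    g_yy = 1.0            # second derivative of g in y
    lam = grad_y_f / g_yy # adjoint variable (scalar linear system)
    return grad_x_f - g_xy * lam


def bilevel_gradient_descent(a=2.0, x0=3.0, y0=0.0, iters=200,
                             step=0.1, lower_steps=3, lower_lr=0.5):
    x, y = x0, y0
    for _ in range(iters):
        # Warm-start the lower-level solve from the previous y.
        y = solve_lower_inexact(x, y, a, lower_steps, lower_lr)
        # Upper-level step along the (inexact) adjoint gradient.
        x -= step * adjoint_gradient(x, y, a)
    return x, y


if __name__ == "__main__":
    print(bilevel_gradient_descent())
```

For this toy problem the exact lower-level solution is y(x) = a*x, so the reduced objective is F(x) = 0.5*x^2 + 0.5*(a*x - 1)^2 with minimizer x = a / (a^2 + 1). With a = 2 the iterates approach x = 0.4 and y = 0.8 even though each lower-level solve is truncated after three steps. A stochastic variant would replace the gradients above with noisy estimates.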

Code & Models

Repositories

gdkent/bsg_methods_con_unc (PyTorch, official)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Sparse and Compressive Sensing Techniques

Methods: Differentiable Architecture Search