Bilevel learning
Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo, Alain Zemkoho

TL;DR
This paper reviews recent advances in bilevel learning, covering scalable algorithms and their limitations, and draws on connections to classical bilevel optimization as a route to solving more complex large-scale problems.
Contribution
It highlights key algorithmic ideas, discusses limitations of current methods, and proposes directions to bridge bilevel learning with classical bilevel optimization.
Findings
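Gradient-based algorithms built on the implicit function reformulation make it possible to solve large-scale bilevel problems with possibly millions of variables.
The implicit function framework requires the lower-level problem to have a unique optimal solution for each upper-level decision, which limits its applicability to narrow problem classes.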
Connections with classical bilevel optimization suggest potential for more general methods.
Abstract
Bilevel learning refers to machine learning problems that can be formulated as bilevel optimization models, where decisions are organized in a hierarchical structure. This paradigm has recently gained considerable attention in machine learning, as gradient-based algorithms built on the implicit function reformulation have enabled the computation of large-scale problems involving possibly millions of variables. Despite these advances, the implicit function framework relies on restrictive assumptions, notably the requirement that the lower-level problem admit a unique optimal solution for each upper-level decision. Moreover, the computation of the derivative of the lower-level optimal solution function becomes significantly more involved when the lower-level problem includes constraints. As a result, many existing bilevel learning algorithms are effective only for relatively narrow…
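The implicit function reformulation described in the abstract can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes a ridge regression lower-level problem, which is strongly convex for lam > 0 and therefore has the unique solution the framework requires, and a validation loss as the upper-level objective. All function names (lower_solution, upper_objective, hypergradient) and the synthetic data are hypothetical choices made for illustration.

```python
import numpy as np

def lower_solution(lam, X_tr, y_tr):
    """Unique minimizer of 0.5*||X_tr w - y_tr||^2 + 0.5*lam*||w||^2 (lam > 0)."""
    d = X_tr.shape[1]
    H = X_tr.T @ X_tr + lam * np.eye(d)   # Hessian of the lower-level objective in w
    w = np.linalg.solve(H, X_tr.T @ y_tr)
    return w, H

def upper_objective(lam, X_tr, y_tr, X_val, y_val):
    """Validation loss evaluated at the lower-level solution w*(lam)."""
    w, _ = lower_solution(lam, X_tr, y_tr)
    r = X_val @ w - y_val
    return 0.5 * (r @ r)

def hypergradient(lam, X_tr, y_tr, X_val, y_val):
    """Derivative of the upper objective w.r.t. lam via the implicit function theorem.

    Differentiating the optimality condition H w*(lam) = X_tr^T y_tr in lam
    gives dw*/dlam = -H^{-1} w*, so the hypergradient is -(grad_w f)^T H^{-1} w*.
    """
    w, H = lower_solution(lam, X_tr, y_tr)
    grad_w = X_val.T @ (X_val @ w - y_val)  # gradient of the validation loss in w
    v = np.linalg.solve(H, grad_w)          # H is symmetric, so one linear solve suffices
    return -(v @ w)

# Hypothetical synthetic data, used only to exercise the functions.
rng = np.random.default_rng(0)
X_tr, X_val = rng.normal(size=(40, 5)), rng.normal(size=(20, 5))
w_true = rng.normal(size=5)
y_tr = X_tr @ w_true + 0.1 * rng.normal(size=40)
y_val = X_val @ w_true + 0.1 * rng.normal(size=20)

lam = 0.5
hg = hypergradient(lam, X_tr, y_tr, X_val, y_val)
eps = 1e-5
fd = (upper_objective(lam + eps, X_tr, y_tr, X_val, y_val)
      - upper_objective(lam - eps, X_tr, y_tr, X_val, y_val)) / (2 * eps)
print("implicit hypergradient:", hg, "finite difference:", fd)
```

The linear solve against H is exactly where the uniqueness assumption enters: if the lower-level problem had several minimizers, or constraints that make the solution map non-smooth, this derivative would not be available in such a simple closed form.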
