Efficient gradient-based methods for bilevel learning via recycling Krylov subspaces

Matthias J. Ehrhardt; Silvia Gazzola; Sebastian J. Scott

arXiv:2412.08264·math.OC·October 9, 2025

TL;DR

This paper introduces a novel recycling Krylov subspace method using Ritz generalized singular vectors to efficiently compute hypergradients in bilevel learning, significantly reducing computational costs in inverse imaging problems.

Contribution

It proposes a new recycling strategy based on Ritz generalized singular vectors and a stopping criterion that directly estimates hypergradient error, advancing bilevel optimization techniques.

Findings

1. Reduces computational cost of hypergradient computation in bilevel learning.

2. Improves convergence and accuracy with the new recycling strategy.

3. Validated through extensive inverse imaging experiments.

Abstract

Many optimization problems require hyperparameters, i.e., parameters that must be specified in advance, such as regularization parameters and parametric regularizers in variational regularization methods for inverse problems, and dictionaries in compressed sensing. A data-driven approach to determining appropriate hyperparameter values is a nested optimization framework known as bilevel learning. Even when a gradient-based solver can be applied to the bilevel optimization problem, constructing the gradients, known as hypergradients, is computationally challenging: each one requires both the solution of a minimization problem and a linear system solve. These systems do not change much during the iterations, which motivates us to apply recycling Krylov subspace methods, wherein information from one linear system solve is re-used to solve the next linear system. Existing…
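To make concrete why each hypergradient costs one lower-level solve plus one linear system solve, here is a minimal sketch on a toy problem. It is not the authors' method or code. The Tikhonov lower-level problem, the parameterization of the regularization weight as exp(theta), and all function names are assumptions chosen for illustration. The linear system is solved with plain conjugate gradients and no recycling.

```python
import numpy as np

def conjugate_gradient(H, b, x0=None, tol=1e-10, max_iter=500):
    """Plain CG for a symmetric positive definite H (no recycling)."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - H @ x
    p = r.copy()
    rs = r @ r
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol * b_norm:
            break
        Hp = H @ p
        alpha = rs / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def lower_level_solution(A, y, theta):
    """Assumed toy lower level: argmin_x 0.5||Ax - y||^2 + 0.5 exp(theta) ||x||^2."""
    H = A.T @ A + np.exp(theta) * np.eye(A.shape[1])  # lower-level Hessian
    return np.linalg.solve(H, A.T @ y), H

def upper_loss(A, y, x_true, theta):
    """Assumed upper level: 0.5 ||x(theta) - x_true||^2."""
    x, _ = lower_level_solution(A, y, theta)
    return 0.5 * np.sum((x - x_true) ** 2)

def hypergradient(A, y, x_true, theta):
    """Implicit-function-theorem hypergradient of the upper-level loss."""
    x, H = lower_level_solution(A, y, theta)   # step 1: minimization problem
    w = conjugate_gradient(H, x - x_true)      # step 2: linear system H w = x - x_true
    # d/dtheta of the lower-level gradient is exp(theta) * x, hence:
    return -np.exp(theta) * (x @ w)

# Synthetic data (hypothetical sizes and noise level).
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 20))
x_true = rng.standard_normal(20)
y = A @ x_true + 0.1 * rng.standard_normal(30)

theta = -1.0
grad = hypergradient(A, y, x_true, theta)

# Central finite difference as a sanity check on the formula.
eps = 1e-6
fd = (upper_loss(A, y, x_true, theta + eps)
      - upper_loss(A, y, x_true, theta - eps)) / (2 * eps)
print(grad, fd)
```

In a gradient-based bilevel solver, `hypergradient` would be called once per outer iteration with a slightly different `theta`, so the matrices `H` in consecutive calls are close to each other. That closeness is the property the abstract points to as motivation for re-using information between solves.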

Code & Models

Repositories

s-j-scott/bilevel-recycling
PyTorch · Official

Taxonomy

Topics: Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM