Bilevel Learning with Inexact Stochastic Gradients

Mohammad Sadegh Salehi; Subhadip Mukherjee; Lindon Roberts; Matthias J. Ehrhardt

arXiv:2412.12049·math.OC·May 20, 2025

TL;DR

This paper introduces a stochastic bilevel learning framework with inexact hypergradients, demonstrating convergence and efficiency improvements for large-scale nonconvex problems in imaging applications.

Contribution

It develops a stochastic bilevel optimization method with convergence guarantees for inexact hypergradients in nonconvex settings, addressing practical limitations of existing approaches.

Findings

01 Significant speed-ups in image denoising and deblurring tasks.

02 Improved generalization over deterministic bilevel methods.

03 Theoretical convergence under mild assumptions.

Abstract

Bilevel learning has gained prominence in machine learning, inverse problems, and imaging applications, including hyperparameter optimization, learning data-adaptive regularizers, and optimizing forward operators. The large-scale nature of these problems has led to the development of inexact and computationally efficient methods. Existing adaptive methods predominantly rely on deterministic formulations, while stochastic approaches often adopt a doubly-stochastic framework with impractical variance assumptions, enforce a fixed number of lower-level iterations, and require extensive tuning. In this work, we focus on bilevel learning with strongly convex lower-level problems and a nonconvex sum-of-functions in the upper-level. Stochasticity arises from data sampling in the upper-level, which leads to inexact stochastic hypergradients. We establish their connection to state-of-the-art…
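The setting described in the abstract (a strongly convex lower-level problem, an upper-level sum over data samples, and hypergradients that are inexact because the lower-level problem is only solved approximately) can be illustrated with a toy sketch. The example below is not the authors' algorithm and is not taken from their repository. It assumes a hypothetical scalar hyperparameter `theta`, a ridge-type lower-level denoiser g(x) = 0.5‖x − y‖² + 0.5·exp(theta)·‖x‖², synthetic Gaussian data, and a hand-picked geometric tolerance schedule. All function names and constants are illustrative.

```python
import numpy as np

# Synthetic training pairs (assumption: Gaussian signals plus Gaussian noise).
rng = np.random.default_rng(0)
n_samples, dim, sigma = 64, 64, 0.5
x_true = rng.standard_normal((n_samples, dim))
y_noisy = x_true + sigma * rng.standard_normal((n_samples, dim))


def lower_solve(y, theta, tol, max_iter=1000):
    """Approximately minimise g(x) = 0.5||x - y||^2 + 0.5*exp(theta)*||x||^2.

    Plain gradient descent, stopped once ||grad g(x)|| <= tol, so the
    returned point is an inexact lower-level solution.
    """
    lam = np.exp(theta)
    x = y.copy()
    step = 0.5 / (1.0 + lam)  # half of 1/L, so the solve is genuinely iterative
    for _ in range(max_iter):
        grad = (1.0 + lam) * x - y
        if np.linalg.norm(grad) <= tol:
            break
        x = x - step * grad
    return x


def inexact_hypergradient(theta, idx, tol):
    """Mini-batch hypergradient via the implicit function theorem.

    Upper-level loss per sample: f_i(x) = 0.5 * mean((x - x_true[i])**2).
    The hypergradient is -(d^2 g / dx dtheta)^T H^{-1} grad_x f_i, evaluated
    at the approximate solution x_hat. Here H = (1 + lam) * I, so the linear
    solve is a division; in general it would also be solved inexactly.
    """
    lam = np.exp(theta)
    total = 0.0
    for i in idx:
        x_hat = lower_solve(y_noisy[i], theta, tol)
        q = (x_hat - x_true[i]) / dim / (1.0 + lam)  # H q = grad_x f_i
        total += -(lam * x_hat) @ q                  # cross term is lam * x
    return total / len(idx)


def upper_loss(theta):
    """Full upper-level loss using the closed-form lower-level solution."""
    x = y_noisy / (1.0 + np.exp(theta))
    return 0.5 * np.mean((x - x_true) ** 2)


theta = 1.0
loss_start = upper_loss(theta)
step_size, batch_size, tol = 2.0, 8, 1e-1  # illustrative constants
for k in range(300):
    idx = rng.choice(n_samples, size=batch_size, replace=False)
    theta -= step_size * inexact_hypergradient(theta, idx, tol)
    tol *= 0.98  # tighten the lower-level accuracy over time (assumed schedule)
loss_end = upper_loss(theta)
print(f"loss: {loss_start:.4f} -> {loss_end:.4f}, exp(theta) = {np.exp(theta):.3f}")
```

For this toy model the expected upper-level loss is minimised at exp(theta) = sigma², so the learned value should settle near 0.25. The fixed step size and tolerance decay are tuned by hand here, which is the kind of manual tuning the abstract identifies as a limitation of existing stochastic approaches.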

Code & Models

Repositories

MohammadSadeghSalehi/ISGD · PyTorch · Official

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Field-Flow Fractionation Techniques

Methods: ADaptive gradient method with the OPTimal convergence rate · Focus