Fine-grained Analysis of Stability and Generalization for Stochastic Bilevel Optimization

Xuelin Zhang; Hong Chen; Bin Gu; Tieliang Gong; Feng Zheng

arXiv:2604.04090·cs.LG·April 7, 2026

TL;DR

This paper analyzes the generalization properties of stochastic bilevel optimization methods, establishing stability-based bounds across various settings and validating findings experimentally.

Contribution

It provides the first systematic stability-based generalization analysis for first-order gradient-based SBO methods across multiple problem settings.

Findings

01

Derived upper bounds on argument stability for SBO methods.

02

Established connections between stability and generalization gap.

03

Experimental results support theoretical bounds.
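To make findings 01 and 02 concrete, the block below sketches the standard single-level form of the link between on-average argument stability and the expected generalization gap. All notation (A, S, S^{(i)}, f, F, F_S, L, \epsilon) is assumed here for illustration and is not taken from the paper; the paper's bilevel statements may involve both the upper- and lower-level variables and different constants.

```latex
% Single-level sketch under assumed notation (not the paper's own statement).
% S = \{z_1,\dots,z_n\} and S' = \{z'_1,\dots,z'_n\} are i.i.d. samples;
% S^{(i)} is S with z_i replaced by z'_i; A is a (possibly randomized) algorithm.
\[
  \mathbb{E}_{S,S',A}\!\left[\frac{1}{n}\sum_{i=1}^{n}
    \bigl\|A(S) - A(S^{(i)})\bigr\|_2\right] \le \epsilon
  \quad\Longrightarrow\quad
  \Bigl|\mathbb{E}_{S,A}\bigl[F(A(S)) - F_S(A(S))\bigr]\Bigr| \le L\,\epsilon ,
\]
% where F(w) = \mathbb{E}_z[f(w;z)] is the population risk,
% F_S(w) = \frac{1}{n}\sum_{i=1}^{n} f(w;z_i) is the empirical risk,
% and f(\cdot;z) is assumed L-Lipschitz for every z.
```

Read this way, an upper bound on argument stability translates directly into a bound on the generalization gap, which is the role the stability bounds play in the findings listed above.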

Abstract

Stochastic bilevel optimization (SBO) has recently been integrated into many machine learning paradigms, including hyperparameter optimization, meta learning, and reinforcement learning. Along with this wide range of applications, there have been numerous studies on the computational behavior of SBO. However, the generalization guarantees of SBO methods are far less understood through the lens of statistical learning theory. In this paper, we provide a systematic generalization analysis of first-order gradient-based bilevel optimization methods. First, we establish quantitative connections between on-average argument stability and the generalization gap of SBO methods. Then, we derive upper bounds on the on-average argument stability of single-timescale stochastic gradient descent (SGD) and two-timescale SGD, where three settings (nonconvex-nonconvex (NC-NC), convex-convex…
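As a rough illustration of the two update schemes named in the abstract, the following Python sketch runs SGD on a toy scalar bilevel problem. Everything in it is an assumption made for illustration: the toy objectives, the function names (`bilevel_sgd`, `run_single_timescale`, `run_two_timescale`), and the step-size schedules. Here "single-timescale" is taken to mean that both variables share the same step-size decay, and "two-timescale" that the lower-level variable uses a slower-decaying (larger) step size; the paper's exact algorithms and schedules may differ.

```python
import random


def bilevel_sgd(upper_data, lower_data, steps, alpha_fn, beta_fn, seed=0):
    """Simultaneous SGD on a toy scalar bilevel problem (illustrative only).

    Lower level: g(x, y; b) = 0.5 * (y - x - b)**2, so y*(x) = x + mean(b).
    Upper level: f(x, y; a) = 0.5 * (y - a)**2.
    For this toy problem the hypergradient estimate
        d_x f - d2_xy g * (d2_yy g)^{-1} * d_y f
    reduces to (y - a), because d_x f = 0, d2_xy g = -1, d2_yy g = 1.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    for t in range(1, steps + 1):
        a = rng.choice(upper_data)   # sample for the upper-level objective
        b = rng.choice(lower_data)   # sample for the lower-level objective
        grad_y = y - x - b           # stochastic gradient of g w.r.t. y
        hyper_x = y - a              # stochastic hypergradient estimate for x
        y -= beta_fn(t) * grad_y     # lower-level step
        x -= alpha_fn(t) * hyper_x   # upper-level step
    return x, y


def run_single_timescale(upper_data, lower_data, steps=5000, seed=0):
    # Assumed schedule: both variables decay at the same rate t^(-1/2).
    step = lambda t: 0.5 / t ** 0.5
    return bilevel_sgd(upper_data, lower_data, steps, step, step, seed)


def run_two_timescale(upper_data, lower_data, steps=5000, seed=0):
    # Assumed schedules: the upper-level step decays faster than the
    # lower-level step, so y tracks y*(x) on the faster timescale.
    alpha = lambda t: 0.5 / t ** 0.6
    beta = lambda t: 0.5 / t ** 0.4
    return bilevel_sgd(upper_data, lower_data, steps, alpha, beta, seed)


if __name__ == "__main__":
    upper = [1.0, 3.0]    # hypothetical upper-level (validation-like) samples
    lower = [-0.5, 0.5]   # hypothetical lower-level (training-like) samples
    print("single-timescale:", run_single_timescale(upper, lower))
    print("two-timescale:   ", run_two_timescale(upper, lower))
```

A stability experiment in the spirit of the abstract could run either routine on two datasets that differ in a single sample and compare the returned `(x, y)` pairs; that comparison is the quantity an argument-stability bound controls.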
