On the Convergence of Single-Loop Stochastic Bilevel Optimization with Approximate Implicit Differentiation
Yubo Zhou, Luo Luo, Guang Dai, Haishan Ye

TL;DR
This paper gives a rigorous convergence analysis of SSAID, a single-loop stochastic bilevel optimization algorithm, showing that it achieves optimal convergence rates with an explicit dependence on the lower-level condition number, which closes a gap in the existing theory.
Contribution
It offers the first explicit convergence rate analysis for stochastic AID-based single-loop methods, matching multi-loop methods' efficiency while clarifying the role of the condition number.
Findings
SSAID achieves an $\tilde{O}(\frac{1}{\epsilon^2})$ convergence rate.
The analysis reveals an explicit $\kappa^{7}$ dependence, where $\kappa$ is the lower-level condition number.
SSAID's convergence guarantees are comparable to multi-loop algorithms.
Abstract
Stochastic Bilevel Optimization has emerged as a fundamental framework for meta-learning and hyperparameter optimization. Despite the practical prevalence of single-loop algorithms--which update lower and upper variables concurrently--their theoretical understanding, particularly in the stochastic regime, remains significantly underdeveloped compared to their multi-loop counterparts. Existing analyses often yield suboptimal convergence rates or obscure the critical dependence on the lower-level condition number $\kappa$, frequently burying it within generic Lipschitz constants. In this paper, we bridge this gap by providing a refined convergence analysis of the Single-loop Stochastic Approximate Implicit Differentiation (SSAID) algorithm. We prove that SSAID reaches an $\epsilon$-stationary point with an oracle complexity of $\tilde{O}(\kappa^{7}\epsilon^{-2})$. Our result is…
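To make "updating lower and upper variables concurrently" concrete, here is a minimal sketch of a generic single-loop AID-style iteration on a toy quadratic bilevel problem. It is not the paper's SSAID pseudocode, which this summary does not reproduce. The toy objectives, the function names `single_loop_aid` and `exact_upper_solution`, the step sizes, and the additive Gaussian noise model are all assumptions made for illustration. Each iteration takes one noisy step on the lower variable `y`, one on the auxiliary vector `v` that approximates the inverse-Hessian-vector product in the implicit-differentiation hypergradient, and one on the upper variable `x`.

```python
import numpy as np


def single_loop_aid(H, A, b, lam, alpha=0.05, beta=0.5, gamma=0.5,
                    steps=3000, noise=0.0, seed=0):
    """Single-loop AID-style updates on an assumed toy problem.

    Lower level: g(x, y) = 0.5 * y^T H y - y^T A x   (H symmetric positive definite)
    Upper level: f(x, y) = 0.5 * ||y - b||^2 + 0.5 * lam * ||x||^2
    `noise` is the standard deviation of Gaussian noise added to each oracle
    (an illustrative stand-in for stochastic gradients).
    """
    rng = np.random.default_rng(seed)
    d_y, d_x = A.shape
    x, y, v = np.zeros(d_x), np.zeros(d_y), np.zeros(d_y)

    for _ in range(steps):
        # Noisy first- and second-order oracles at the current iterate.
        grad_y_g = H @ y - A @ x + noise * rng.standard_normal(d_y)
        grad_y_f = y - b + noise * rng.standard_normal(d_y)
        hess_yy_g_v = H @ v + noise * rng.standard_normal(d_y)
        # Hypergradient estimate: grad_x f - (cross derivative of g) v,
        # where the cross derivative of g is -A^T for this toy problem.
        hypergrad = lam * x + A.T @ v + noise * rng.standard_normal(d_x)

        # All three variables move in the same iteration (no inner loops).
        y = y - beta * grad_y_g                    # lower-level descent step
        v = v - gamma * (hess_yy_g_v - grad_y_f)   # step toward H v = grad_y f
        x = x - alpha * hypergrad                  # upper-level descent step

    return x, y, v


def exact_upper_solution(H, A, b, lam):
    """Closed-form minimizer of F(x) = f(x, y*(x)) with y*(x) = H^{-1} A x."""
    B = np.linalg.solve(H, A)
    return np.linalg.solve(B.T @ B + lam * np.eye(A.shape[1]), B.T @ b)
```

With `noise=0.0` the iteration is deterministic and its output can be compared against `exact_upper_solution`. With `noise > 0` and constant step sizes, the iterates should only be expected to settle in a neighborhood of the solution. In this sketch the upper step size `alpha` is deliberately smaller than `beta` and `gamma`, so that `y` and `v` can track their targets as `x` moves.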
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference
