Exploring the Generalization Capabilities of AID-based Bi-level Optimization
Congliang Chen, Li Shen, Zhiqiang Xu, Wei Liu, Zhi-Quan Luo, Peilin Zhao

TL;DR
This paper investigates the generalization properties of AID-based bi-level optimization methods, establishing their stability and convergence despite the nonconvex outer function, supported by theoretical analysis and real-world experiments.
Contribution
It provides the first stability and generalization analysis for AID-based bi-level optimization methods, which are traditionally less understood than ITD-based approaches.
Findings
AID-based methods are uniformly stable under certain conditions.
The convergence analysis shows that suitably chosen step sizes preserve stability.
Experimental results validate theoretical stability and effectiveness.
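For reference, the "uniformly stable" property named in the findings is usually stated as below for single-level learning. This is the standard textbook form, given here as an assumption about what the term means; the paper's own bi-level definition and constants are not reproduced on this page. The symbols $A$ (randomized algorithm), $S, S'$ (datasets), $\ell$ (loss), $R$ (population risk) and $R_S$ (empirical risk) are notation introduced for this sketch.

```latex
% A is \epsilon-uniformly stable if, for all datasets S, S' of size n
% that differ in at most one example,
\[
  \sup_{z}\; \mathbb{E}_{A}\bigl[\ell(A(S); z) - \ell(A(S'); z)\bigr] \;\le\; \epsilon .
\]
% Uniform stability then bounds the expected generalization gap:
\[
  \Bigl|\, \mathbb{E}_{S,A}\bigl[R(A(S)) - R_{S}(A(S))\bigr] \Bigr| \;\le\; \epsilon .
\]
```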
Abstract
Bi-level optimization has achieved considerable success in contemporary machine learning applications, especially when proper hyperparameters are given. Because of the two-level optimization structure, researchers commonly focus on two types of bi-level optimization methods: approximate implicit differentiation (AID)-based and iterative differentiation (ITD)-based approaches. ITD-based methods can be readily transformed into single-level optimization problems, facilitating the study of their generalization capabilities. In contrast, AID-based methods cannot be easily transformed in a similar way and must retain the two-level structure, leaving their generalization properties enigmatic. In this paper, although the outer-level function is nonconvex, we ascertain the uniform stability of AID-based methods, obtaining results similar to those for a single-level nonconvex problem. We conduct a…
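To make the AID idea concrete, here is a minimal sketch of an approximate implicit-differentiation hypergradient on a toy quadratic problem. It is not the paper's algorithm: the function name `aid_hypergradient`, the matrices `H` and `B`, the target `t`, the weight `mu`, and the toy objectives are all assumptions chosen so that the exact hypergradient is available for comparison. The linear system is solved with plain gradient descent for simplicity; conjugate gradient is a common alternative.

```python
import numpy as np

# Toy bi-level problem (all names are illustrative assumptions):
#   inner:  g(x, y) = 0.5 * y^T H y - y^T B x      (H symmetric positive definite)
#   outer:  f(x, y) = 0.5 * ||y - t||^2 + 0.5 * mu * ||x||^2
# Implicit hypergradient:
#   dF/dx = grad_x f - (d/dx grad_y g)^T [hess_yy g]^{-1} grad_y f
#         = mu * x + B^T H^{-1} (y*(x) - t)

def aid_hypergradient(x, y0, H, B, t, mu, inner_steps=200, solver_steps=200):
    lr = 1.0 / np.linalg.norm(H, 2)  # step size 1/L, with L the largest eigenvalue of H

    # Step 1: approximate the inner solution y*(x) by gradient descent on g(x, .).
    y = y0.copy()
    for _ in range(inner_steps):
        y = y - lr * (H @ y - B @ x)

    # Step 2: approximately solve H v = grad_y f(x, y) without forming H^{-1},
    # here by gradient descent on the quadratic 0.5 * v^T H v - v^T b.
    b = y - t
    v = np.zeros_like(y)
    for _ in range(solver_steps):
        v = v - lr * (H @ v - b)

    # Step 3: assemble the hypergradient; for this toy g, d/dx grad_y g = -B.
    return mu * x + B.T @ v, y

rng = np.random.default_rng(0)
n, m = 3, 4
M = rng.standard_normal((m, m))
H = np.eye(m) + 0.1 * M @ M.T      # well-conditioned SPD inner Hessian
B = rng.standard_normal((m, n))
t = rng.standard_normal(m)
x = rng.standard_normal(n)
mu = 0.1

approx, y_hat = aid_hypergradient(x, np.zeros(m), H, B, t, mu)

# Exact hypergradient via direct linear solves, for comparison only.
y_star = np.linalg.solve(H, B @ x)
exact = mu * x + B.T @ np.linalg.solve(H, y_star - t)
print(np.max(np.abs(approx - exact)))
```

The two-level structure stays visible here: the inner iterate `y` and the linear-system iterate `v` are maintained separately from the outer variable `x`, which is the feature the abstract contrasts with ITD-based methods.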
Taxonomy
Topics: Advanced Numerical Analysis Techniques
Methods: Focus
