Stability and Generalization for Stochastic Recursive Momentum-based Algorithms for (Strongly-)Convex One to $K$-Level Stochastic Optimizations

Xiaokang Pan; Xingyu Li; Jin Liu; Tao Sun; Kai Sun; Lixing Chen; Zhe Qu

arXiv:2407.05286·cs.LG·July 9, 2024

Open Access

TL;DR

This paper analyzes the generalization performance of STORM-based stochastic optimization algorithms across multiple levels, providing stability-based bounds and insights into how levels and batch sizes affect their ability to generalize.

Contribution

It offers the first comprehensive stability analysis for multi-level STORM algorithms, linking stability to generalization and deriving excess risk bounds.

Findings

01

Stability decreases with variance in estimators.

02

More levels can increase generalization error.

03

Larger initial batch sizes improve generalization.

Abstract

STOchastic Recursive Momentum (STORM)-based algorithms have been widely developed to solve one to $K$-level ($K \geq 3$) stochastic optimization problems. Specifically, they use estimators to mitigate the biased gradient issue and achieve near-optimal convergence results. However, there is relatively little work on understanding their generalization performance, particularly evident during the transition from one to $K$-level optimization contexts. This paper provides a comprehensive generalization analysis of three representative STORM-based algorithms: STORM, COVER, and SVMR, for one, two, and $K$-level stochastic optimizations under both convex and strongly convex settings based on algorithmic stability. Firstly, we define stability for $K$-level optimizations and link it to generalization. Then, we detail the stability results for three prominent STORM-based algorithms. Finally, we…
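To make the "estimator" idea in the abstract concrete, here is a minimal sketch of the standard single-level STORM recursion, $d_t = \nabla f(x_t; \xi_t) + (1 - a)\,(d_{t-1} - \nabla f(x_{t-1}; \xi_t))$, in which the same sample $\xi_t$ is evaluated at both the new and the previous iterate. It is an illustration under stated assumptions, not the exact variant analyzed in the paper: the function names (`storm`, `quad_grad`, `make_sampler`), the constant step size `lr` and momentum parameter `a`, and the toy quadratic objective are all hypothetical choices, and the two-level (COVER) and $K$-level (SVMR) estimators are not shown.

```python
import numpy as np


def storm(grad, x0, sample, steps=500, lr=0.05, a=0.1, seed=0):
    """Single-level STORM sketch with constant lr and a (assumed, not from the paper).

    grad(x, xi) returns a stochastic gradient at x for sample xi;
    sample(rng) draws one sample xi.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = grad(x, sample(rng))  # initial estimator: one plain stochastic gradient
    for _ in range(steps):
        x_new = x - lr * d
        xi = sample(rng)  # one fresh sample, used at BOTH iterates below
        # Recursive momentum: new gradient plus a corrected copy of the old estimator.
        d = grad(x_new, xi) + (1.0 - a) * (d - grad(x, xi))
        x = x_new
    return x


# Toy convex problem (hypothetical): f(x; xi) = 0.5 * ||x - xi||^2, xi ~ N(mu, sigma^2 I).
# The population minimizer is mu.
def quad_grad(x, xi):
    return x - xi


def make_sampler(mu, sigma=0.1):
    mu = np.asarray(mu, dtype=float)
    return lambda rng: mu + sigma * rng.standard_normal(mu.shape)


if __name__ == "__main__":
    mu = np.array([1.0, -2.0])
    x_hat = storm(quad_grad, np.zeros(2), make_sampler(mu))
    print("estimate:", x_hat, "distance to minimizer:", np.linalg.norm(x_hat - mu))
```

The initial estimator here uses a single sample; the findings above suggest that a larger initial batch helps generalization, which in this sketch would correspond to averaging several calls to `grad` before the loop.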

Taxonomy

Topics: Risk and Portfolio Optimization · Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods