Faster Stochastic Variance Reduction Methods for Compositional MiniMax Optimization

Jin Liu; Xiaokang Pan; Junwen Duan; Hongdong Li; Youqi Li; Zhe Qu

arXiv:2308.09604·cs.LG·December 13, 2023

TL;DR

This paper introduces NSTORM and ADA-NSTORM, novel stochastic methods for compositional minimax optimization that achieve optimal sample complexity without large batch sizes, with extensive experiments confirming their effectiveness.

Contribution

The paper proposes NSTORM and ADA-NSTORM, new algorithms that improve sample complexity and practicality for compositional minimax optimization in machine learning.

Findings

01

NSTORM achieves the optimal $O\left(\frac{\text{poly}(\text{condition number})}{\text{accuracy}^3}\right)$ sample complexity.

02

ADA-NSTORM maintains the same complexity with adaptive learning rates.

03

Experimental results demonstrate superior efficiency of the proposed methods.

Abstract

This paper studies stochastic optimization for compositional minimax problems, a pivotal challenge across various machine learning domains, including deep AUC and reinforcement learning policy evaluation. Despite its significance, compositional minimax optimization is still under-explored, and current methods suffer from sub-optimal complexities or rely heavily on sizable batch sizes. To address these limitations, this paper introduces a novel method, called Nested STOchastic Recursive Momentum (NSTORM), which achieves the optimal sample complexity of $O(κ^{3}/ϵ^{3})$ to obtain an $ϵ$-accurate solution. We also demonstrate that NSTORM achieves the same sample complexity under the Polyak-Łojasiewicz (PL) condition, an insightful extension of its…
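The abstract names the method but does not spell out its update rule. As background only, the sketch below shows the generic single-level stochastic recursive momentum (STORM-style) gradient estimator that the name refers to, applied to a toy quadratic. It is not the paper's nested algorithm: the objective, the additive-noise oracle, the function name `storm_minimize`, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def storm_minimize(grad, x0, steps=2000, lr=0.05, a=0.1, noise=0.1, seed=0):
    """Generic STORM-style recursive momentum (illustrative, not NSTORM).

    grad  : exact gradient of a toy objective; Gaussian noise is added
            to mimic a stochastic oracle (an assumption of this sketch).
    a     : momentum parameter in (0, 1]; a = 1 recovers plain SGD.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    # Initial estimate: a single stochastic gradient.
    d = grad(x) + rng.normal(scale=noise, size=x.shape)
    for _ in range(steps):
        x_prev, x = x, x - lr * d
        # One fresh sample, evaluated at BOTH the new and the previous point.
        xi = rng.normal(scale=noise, size=x.shape)
        g_new = grad(x) + xi
        g_old = grad(x_prev) + xi
        # Recursive momentum: new gradient plus a corrected carry-over term.
        d = g_new + (1.0 - a) * (d - g_old)
    return x

# Toy problem (hypothetical): f(x) = 0.5 * ||x - target||^2.
target = np.array([1.0, -2.0])
x_final = storm_minimize(lambda x: x - target, x0=np.zeros(2))
```

Reusing the same sample at both points is what lets the correction term cancel most of the noise, so the estimator needs only one sample per step rather than a large batch. The paper applies this idea in a nested fashion to the compositional minimax setting; the details are in the full text rather than in this summary.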

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Advanced Bandit Algorithms Research