Generalization Bounds for Stochastic Saddle Point Problems
Junyu Zhang, Mingyi Hong, Mengdi Wang, Shuzhong Zhang

TL;DR
This paper establishes new generalization bounds for empirical saddle point solutions in stochastic saddle point problems, covering various assumptions and demonstrating near-optimal sample complexity in applications like policy learning and game theory.
Contribution
It introduces the first generalization bounds for empirical saddle point (ESP) solutions of stochastic saddle point (SSP) problems, including cases without strong convexity or bounded domains.
Findings
Achieves an $\mathcal{O}(1/n)$ generalization bound for Lipschitz continuous, strongly convex-strongly concave functions.
Provides bounds under weaker assumptions, including objectives without strong convexity and problems with unbounded domains.
Shows regularized ESP solutions have near-optimal sample complexity in practical examples.
Abstract
This paper studies the generalization bounds for the empirical saddle point (ESP) solution to stochastic saddle point (SSP) problems. For SSP with Lipschitz continuous and strongly convex-strongly concave objective functions, we establish an $\mathcal{O}(1/n)$ generalization bound by using a uniform stability argument. We also provide generalization bounds under a variety of assumptions, including the cases without strong convexity and without bounded domains. We illustrate our results in two examples: batch policy learning in Markov decision processes, and mixed strategy Nash equilibrium estimation for stochastic games. In each of these examples, we show that a regularized ESP solution enjoys a near-optimal sample complexity. To the best of our knowledge, this is the first set of results on the generalization theory of ESP.
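To make the term "empirical saddle point" concrete, here is a minimal sketch on a toy problem. The quadratic objective, the sample distribution, and all function names below are illustrative assumptions and do not come from the paper. The ESP solves the min-max problem for the sample average of the objective, while the population saddle point solves it for the expectation. The sketch only shows the two solutions getting closer as n grows; it does not reproduce the paper's bound or its proof technique.

```python
import random

MU = 1.0  # strong convexity / strong concavity modulus (illustrative choice)


def sample_xi(rng):
    """Draw one hypothetical sample xi = (a, b, c) with means (1.0, 0.5, -0.3)."""
    return (rng.gauss(1.0, 1.0), rng.gauss(0.5, 1.0), rng.gauss(-0.3, 1.0))


def saddle_point(a, b, c, mu=MU):
    """Saddle point of f(x, y) = (mu/2)x^2 - (mu/2)y^2 + a*x*y + b*x - c*y.

    f is strongly convex in x and strongly concave in y, so the saddle point
    is the unique solution of the first-order conditions
        mu*x + a*y + b = 0   and   a*x - mu*y - c = 0.
    """
    det = mu * mu + a * a
    x = (a * c - mu * b) / det
    y = -(a * b + mu * c) / det
    return x, y


def empirical_saddle_point(samples, mu=MU):
    """ESP: saddle point of the sample-average objective.

    The toy objective is linear in (a, b, c), so averaging the objective
    over the samples is the same as plugging in the sample means.
    """
    n = len(samples)
    a_bar = sum(s[0] for s in samples) / n
    b_bar = sum(s[1] for s in samples) / n
    c_bar = sum(s[2] for s in samples) / n
    return saddle_point(a_bar, b_bar, c_bar, mu)


# Population saddle point: plug in the true means of (a, b, c).
x_star, y_star = saddle_point(1.0, 0.5, -0.3)

rng = random.Random(0)
for n in (100, 10_000, 200_000):
    x_n, y_n = empirical_saddle_point([sample_xi(rng) for _ in range(n)])
    err = max(abs(x_n - x_star), abs(y_n - y_star))
    print(f"n={n:>7}  ESP=({x_n:+.4f}, {y_n:+.4f})  max error={err:.4f}")
```

The closed-form solve is possible only because the toy objective is quadratic; for general objectives the ESP would have to be computed by a numerical min-max solver.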
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Stochastic Gradient Optimization Techniques
