Train simultaneously, generalize better: Stability of gradient-based minimax learners

Farzan Farnia; Asuman Ozdaglar

arXiv:2010.12561·cs.LG·October 26, 2020·6 cites

TL;DR

This paper investigates how the choice of optimization algorithm, such as GDA or PPM, affects the generalization of minimax learners, and finds that simultaneous training can lead to better generalization in GANs.

Contribution

It provides a theoretical analysis of the generalization properties of GDA and PPM algorithms in minimax problems, highlighting the benefits of simultaneous training.

Findings

01 PPM enjoys bounded excess risk in convex concave problems.

02 GDA's generalization depends on solving subproblems simultaneously.

03 Numerical results support the importance of optimization choice for generalization.

Abstract

The success of minimax learning problems of generative adversarial networks (GANs) has been observed to depend on the minimax optimization algorithm used for their training. This dependence is commonly attributed to the convergence speed and robustness properties of the underlying optimization algorithm. In this paper, we show that the optimization algorithm also plays a key role in the generalization performance of the trained minimax model. To this end, we analyze the generalization properties of standard gradient descent ascent (GDA) and proximal point method (PPM) algorithms through the lens of algorithmic stability under both convex concave and non-convex non-concave minimax settings. While the GDA algorithm is not guaranteed to have a vanishing excess risk in convex concave problems, we show the PPM algorithm enjoys a bounded excess risk in the same setup. For non-convex…
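
As a rough illustration of the two update rules named in the abstract, the sketch below runs simultaneous GDA and PPM on a toy bilinear objective f(x, y) = x * y. The objective, step size, starting point, and function names are assumptions chosen for illustration and are not taken from the paper. The sketch shows only how the iterates behave during optimization, not the paper's stability or generalization analysis.

```python
import math

def gda_step(x, y, eta):
    # Simultaneous gradient descent ascent on f(x, y) = x * y.
    # grad_x f = y (descent for the min player), grad_y f = x (ascent for the max player).
    return x - eta * y, y + eta * x

def ppm_step(x, y, eta):
    # Proximal point update: the gradients are evaluated at the NEW iterate,
    #   x_new = x - eta * y_new,   y_new = y + eta * x_new.
    # For this bilinear toy objective the implicit system has a closed-form solution.
    denom = 1.0 + eta * eta
    return (x - eta * y) / denom, (y + eta * x) / denom

def run(step, x0=1.0, y0=1.0, eta=0.1, iters=100):
    # Hypothetical helper: iterate a step rule and return the distance
    # of the final iterate from the saddle point (0, 0).
    x, y = x0, y0
    for _ in range(iters):
        x, y = step(x, y, eta)
    return math.hypot(x, y)

start_norm = math.hypot(1.0, 1.0)
gda_norm = run(gda_step)  # grows: each GDA step scales the norm by sqrt(1 + eta^2)
ppm_norm = run(ppm_step)  # shrinks: each PPM step scales the norm by 1 / sqrt(1 + eta^2)
print(f"start={start_norm:.3f}  GDA={gda_norm:.3f}  PPM={ppm_norm:.3f}")
```

On this toy problem the GDA iterates move away from the saddle point while the PPM iterates contract toward it, which gives some intuition for why the two algorithms can behave differently even in a convex concave setting.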

Videos

Train simultaneously, generalize better: Stability of gradient-based minimax learners · SlidesLive

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning