Efficient Stochastic Gradient Descent for Learning with Distributionally Robust Optimization

Soumyadip Ghosh; Mark Squillante; Ebisa Wollega

arXiv:1805.08728·stat.ML·November 3, 2020·5 cites

Open Access

TL;DR

This paper introduces an efficient stochastic gradient descent algorithm for distributionally robust optimization, improving convergence and generalization in machine learning models through a novel subset sampling approach.

Contribution

The paper presents a new stochastic gradient descent method tailored for DRO problems, with theoretical analysis on subset size growth and empirical validation showing improved performance.

Findings

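01  The algorithm converges when the subset of data used in each iteration is progressively enlarged.

02  Empirical results show improved model generalization and significant benefits over previous work.

03  Theoretical analysis establishes how fast the subset size should grow to balance stochastic error against computational effort.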

Abstract

Distributionally robust optimization (DRO) problems are increasingly seen as a viable method to train machine learning models for improved model generalization. These min-max formulations, however, are more difficult to solve. We therefore provide a new stochastic gradient descent algorithm to efficiently solve this DRO formulation. Our approach applies gradient descent to the outer minimization formulation and estimates the gradient of the inner maximization based on a sample average approximation. The latter uses a subset of the data in each iteration, progressively increasing the subset size to ensure convergence. Theoretical results include establishing the optimal manner for growing the support size to balance a fundamental tradeoff between stochastic error and computational effort. Empirical results demonstrate the significant benefits of our approach over previous work, and also…
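
The loop described in the abstract (a descent step on the outer minimization, with the inner maximization estimated from a data subset that grows over iterations) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the KL-penalized inner problem (chosen here only because its maximizer has a closed form), the squared loss, sampling without replacement, the geometric growth factor, and the names `worst_case_weights` and `dro_sgd` are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def worst_case_weights(losses, lam):
    """Maximizer of sum_i p_i * losses_i - lam * KL(p || uniform) over the simplex.

    ASSUMPTION: a KL penalty stands in for the paper's uncertainty set.
    The closed form is a softmax of the losses scaled by 1 / lam.
    """
    z = (losses - losses.max()) / lam  # shift by the max for numerical stability
    w = np.exp(z)
    return w / w.sum()


def dro_sgd(X, y, steps=200, lr=0.05, lam=1.0, m0=8, growth=1.05, seed=0):
    """Gradient descent on the outer problem with a growing sampled subset.

    ASSUMPTIONS: linear model with squared loss, geometric growth of the
    subset size; `growth` is a placeholder, not the paper's optimal rate.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    m = float(m0)
    sizes = []
    for _ in range(steps):
        k = min(n, int(m))  # subset size, capped at the full data set
        idx = rng.choice(n, size=k, replace=False)
        resid = X[idx] @ theta - y[idx]
        losses = 0.5 * resid**2
        # Inner maximization, approximated on the sampled subset only.
        p = worst_case_weights(losses, lam)
        # Gradient of the weighted loss with the weights held fixed at the maximizer.
        grad = X[idx].T @ (p * resid)
        theta -= lr * grad
        sizes.append(k)
        m *= growth  # enlarge the subset for the next iteration
    return theta, sizes
```

Calling `dro_sgd(X, y)` on a feature matrix `X` and target vector `y` returns the fitted parameters together with the subset size used at each step. Small early subsets keep the first iterations cheap, while larger later subsets reduce the error in the sampled estimate of the inner maximization; that is the tradeoff the abstract refers to.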

Taxonomy

Topics: Risk and Portfolio Optimization · Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques