Unregularized limit of stochastic gradient method for Wasserstein distributionally robust optimization

Tam Le (LPSM; UPCité)

arXiv:2506.04948 · math.OC · June 6, 2025

TL;DR

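This paper analyzes the unregularized limit of stochastic gradient methods for Wasserstein distributionally robust optimization: it shows that the entropically regularized, sampling-based approximation recovers the critical points of the original problem as the regularization vanishes, and derives convergence guarantees for a projected stochastic gradient method.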

Contribution

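It establishes the convergence of approximate gradients and the concentration of critical points in Wasserstein DRO as the regularization diminishes and the number of approximation samples increases, in a general machine learning setting with an unbounded sample space and mixed continuous-discrete data.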

Findings

01

The approximate gradient converges over a compact set.

02

Critical points of the regularized problem concentrate on the critical set of the original problem.

03

Convergence guarantees for a projected stochastic gradient method, valid for unbounded sample spaces.

Abstract

Distributionally robust optimization offers a compelling framework for model fitting in machine learning, as it systematically accounts for data uncertainty. Focusing on Wasserstein distributionally robust optimization, we investigate the regularized problem where entropic smoothing yields a sampling-based approximation of the original objective. We establish the convergence of the approximate gradient over a compact set, leading to the concentration of the regularized problem critical points onto the original problem critical set as regularization diminishes and the number of approximation samples increases. Finally, we deduce convergence guarantees for a projected stochastic gradient method. Our analysis covers a general machine learning situation with an unbounded sample space and mixed continuous-discrete data.
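To make the idea of an entropically smoothed, sampling-based objective more concrete, here is a minimal Python sketch. It is not the paper's algorithm. Everything specific in it is an assumption made for illustration: a fixed dual multiplier `lam`, a squared Euclidean transport cost, Gaussian perturbations of scale `sigma` as the reference sampler, a toy loss `0.5 * (theta . zeta - 1)^2`, and projection onto a Euclidean ball of radius `radius`. The inner supremum over perturbed samples is replaced by `eps * log(mean(exp(. / eps)))`, which approaches the maximum over the drawn samples as `eps` goes to zero.

```python
import numpy as np


def project_ball(theta, radius):
    """Euclidean projection onto the ball {theta : ||theta|| <= radius}."""
    norm = np.linalg.norm(theta)
    return theta if norm <= radius else theta * (radius / norm)


def smoothed_value_and_grad(theta, xi, lam, eps, sigma, m, rng):
    """Sampled, entropically smoothed surrogate of sup_zeta f(theta, zeta) - lam * ||zeta - xi||^2.

    Hypothetical toy loss: f(theta, zeta) = 0.5 * (theta . zeta - 1)^2.
    Returns the smoothed value, its gradient in theta, and the raw sample values.
    """
    # Draw m perturbed samples around the data point xi (assumed Gaussian sampler).
    zetas = xi + sigma * rng.standard_normal((m, xi.size))
    resid = zetas @ theta - 1.0
    vals = 0.5 * resid**2 - lam * np.sum((zetas - xi) ** 2, axis=1)

    # Numerically stable eps * log(mean(exp(vals / eps))).
    shifted = (vals - vals.max()) / eps
    expo = np.exp(shifted)
    value = vals.max() + eps * np.log(np.mean(expo))

    # The gradient is a softmax-weighted average of the per-sample loss gradients.
    weights = expo / expo.sum()
    grad = (weights * resid) @ zetas
    return value, grad, vals


def projected_sgd(data, theta0, radius, lam, eps, sigma, m, steps, step_size, seed=0):
    """Projected stochastic gradient iterations on the smoothed surrogate."""
    rng = np.random.default_rng(seed)
    theta = project_ball(np.asarray(theta0, dtype=float), radius)
    for _ in range(steps):
        xi = data[rng.integers(len(data))]  # one data point per iteration
        _, grad, _ = smoothed_value_and_grad(theta, xi, lam, eps, sigma, m, rng)
        theta = project_ball(theta - step_size * grad, radius)
    return theta


if __name__ == "__main__":
    demo_rng = np.random.default_rng(1)
    demo_data = demo_rng.standard_normal((50, 2))
    print(projected_sgd(demo_data, [0.0, 0.0], radius=2.0, lam=1.0,
                        eps=0.1, sigma=0.5, m=32, steps=200, step_size=0.05))
```

In this sketch the smoothed value always lies between `max(vals) - eps * log(m)` and `max(vals)`, which gives a quick sanity check on how `eps` and the number of samples `m` interact. The abstract's results concern the regime where the regularization diminishes and the number of approximation samples increases.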
