Large-Scale Methods for Distributionally Robust Optimization
Daniel Levy, Yair Carmon, John C. Duchi, Aaron Sidford

TL;DR
This paper introduces scalable algorithms for distributionally robust optimization with CVaR and $\chi^2$ divergence uncertainty sets, with gradient-complexity guarantees suited to large datasets and better efficiency than full-batch methods in experiments.
Contribution
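The paper gives the first gradient-evaluation complexity bounds for $\chi^2$ uncertainty sets that are independent of training set size and number of parameters. For CVaR, the bounds scale linearly in the uncertainty level rather than quadratically, and lower bounds prove worst-case optimality for CVaR and for a penalized version of the problem.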
Findings
The algorithms require a number of gradient evaluations that is independent of training set size and number of parameters.
In experiments, the algorithms are 9 to 36 times more efficient than full-batch methods.
Theoretical guarantees are validated on MNIST and ImageNet datasets.
Abstract
We propose and analyze algorithms for distributionally robust optimization of convex losses with conditional value at risk (CVaR) and $\chi^2$ divergence uncertainty sets. We prove that our algorithms require a number of gradient evaluations independent of training set size and number of parameters, making them suitable for large-scale applications. For $\chi^2$ uncertainty sets these are the first such guarantees in the literature, and for CVaR our guarantees scale linearly in the uncertainty level rather than quadratically as in previous work. We also provide lower bounds proving the worst-case optimality of our algorithms for CVaR and a penalized version of the problem. Our primary technical contributions are novel bounds on the bias of batch robust risk estimation and the variance of a multilevel Monte Carlo gradient estimator due to [Blanchet & Glynn, 2015]. Experiments on…
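As a rough illustration of what "batch robust risk estimation" can look like for CVaR, the sketch below evaluates the CVaR of a mini-batch of losses using the standard dual form: the worst-case reweighting places at most 1/(alpha * n) mass on each of the n losses, so the mass goes to the largest losses first. This is a minimal sketch under that textbook definition, not the authors' implementation; the function names `cvar_weights` and `batch_cvar` are hypothetical.

```python
def cvar_weights(losses, alpha):
    """Worst-case weights for CVaR at level alpha (0 < alpha <= 1).

    Assumption: CVaR is defined as the maximum of sum_i q_i * loss_i over
    probability vectors q with every q_i <= 1 / (alpha * n).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    n = len(losses)
    cap = 1.0 / (alpha * n)  # largest mass any single example may receive
    # Visit examples from largest loss to smallest.
    order = sorted(range(n), key=lambda i: losses[i], reverse=True)
    weights = [0.0] * n
    remaining = 1.0  # probability mass still to be assigned
    for i in order:
        if remaining <= 0.0:
            break
        w = min(cap, remaining)
        weights[i] = w
        remaining -= w
    return weights


def batch_cvar(losses, alpha):
    """CVaR of a batch: the worst-case weighted average of its losses."""
    weights = cvar_weights(losses, alpha)
    return sum(w * loss for w, loss in zip(weights, losses))


if __name__ == "__main__":
    batch = [1.0, 2.0, 3.0, 4.0]
    print(batch_cvar(batch, 0.5))  # average of the two largest losses
    print(batch_cvar(batch, 1.0))  # plain average of the batch
```

With per-example gradients available, the same weights give a subgradient of the batch CVaR as the weighted sum of those gradients. Evaluating the risk on a mini-batch rather than the full training set introduces bias, which is the quantity the abstract says the paper bounds.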
Taxonomy
Topics: Risk and Portfolio Optimization · Advanced Control Systems Optimization · Supply Chain and Inventory Management
