Robust Wasserstein Profile Inference and Applications to Machine Learning

Jose Blanchet, Yang Kang, and Karthyek Murthy

arXiv:1610.05627·math.ST·October 22, 2020·J. Appl. Probab.

TL;DR

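This paper represents machine learning estimators such as square-root LASSO and regularized logistic regression as solutions to distributionally robust optimization problems over Wasserstein uncertainty regions, introduces RWPI (Robust Wasserstein Profile Inference) to select the size of those regions, and uses that selection to set regularization parameters without cross-validation.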

Contribution

The paper interprets regularization as the effect of an artificial adversary that perturbs the empirical distribution, and introduces RWPI for data-driven selection of the regularization parameter.

Findings

01. Regularization can be viewed as adversarial distribution perturbation.

02. RWPI enables optimal uncertainty region size selection.

03. Regularization parameters can be chosen without cross-validation.

Abstract

We show that several machine learning estimators, including square-root LASSO (Least Absolute Shrinkage and Selection Operator) and regularized logistic regression, can be represented as solutions to distributionally robust optimization (DRO) problems. The associated uncertainty regions are based on suitably defined Wasserstein distances. Hence, our representations allow us to view regularization as a result of introducing an artificial adversary that perturbs the empirical distribution to account for out-of-sample effects in loss estimation. In addition, we introduce RWPI (Robust Wasserstein Profile Inference), a novel inference methodology which extends the use of methods inspired by Empirical Likelihood to the setting of optimal transport costs (of which Wasserstein distances are a particular case). We use RWPI to show how to optimally select the size of uncertainty regions, and as a…
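The adversary interpretation in the abstract can be checked numerically for square-root LASSO. The sketch below is illustrative only and rests on assumptions not spelled out on this page: that the worst-case squared loss over the uncertainty region equals (sqrt(MSE) + sqrt(delta) * ||beta||_1)^2, and that the transport cost is the squared sup-norm displacement of the predictors with the responses held fixed. The function names, the synthetic data, and the value of `delta` are all hypothetical. The code builds one explicit perturbation of the empirical sample that spends exactly the budget `delta` and reaches that closed form; it does not show that no other perturbation does better.

```python
import numpy as np


def sqrt_lasso_objective(beta, X, y, delta):
    """Square-root LASSO objective with penalty weight sqrt(delta) (assumed form)."""
    resid = y - X @ beta
    return np.sqrt(np.mean(resid ** 2)) + np.sqrt(delta) * np.sum(np.abs(beta))


def adversarial_perturbation(beta, X, y, delta):
    """Shift each predictor row to enlarge its residual.

    Each row moves by t_i in sup-norm, with t_i proportional to |residual_i|,
    so that the average squared displacement equals delta.
    """
    resid = y - X @ beta
    rmse = np.sqrt(np.mean(resid ** 2))
    t = np.sqrt(delta) * np.abs(resid) / rmse
    # Moving x_i against sign(resid_i) * sign(beta) raises |resid_i| by t_i * ||beta||_1.
    shift = (np.sign(resid) * t)[:, None] * np.sign(beta)[None, :]
    return X - shift


# Hypothetical synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, 0.0, -2.0, 0.0, 0.5]) + rng.normal(scale=0.3, size=200)

beta = np.array([0.8, 0.1, -1.5, 0.0, 0.3])  # any fixed candidate coefficients
delta = 0.05                                 # assumed uncertainty-region size

X_adv = adversarial_perturbation(beta, X, y, delta)
worst_case_mse = np.mean((y - X_adv @ beta) ** 2)
closed_form = sqrt_lasso_objective(beta, X, y, delta) ** 2
transport_cost = np.mean(np.max(np.abs(X_adv - X), axis=1) ** 2)

print(worst_case_mse, closed_form, transport_cost)
```

Under these assumptions, `worst_case_mse` matches `closed_form` and `transport_cost` matches `delta` up to floating-point error, which is the sense in which the regularization term acts as the price of an adversarial shift of the empirical distribution.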
