Empirical Optimal Transport under Estimated Costs: Distributional Limits and Statistical Applications
Shayan Hundrieser, Gilles Mordant, Christoph Alexander Weitkamp, Axel Munk

TL;DR
This paper derives distributional limits for empirical optimal transport values when costs and measures are estimated from data, enabling statistical inference and stability analysis in various applications.
Contribution
It introduces new distributional limit results for empirical OT with estimated costs, applicable to goodness-of-fit testing, machine learning, and distribution comparison.
Findings
Distributional limits are established under either weak convergence of the cost process in uniform norm or when the cost is determined by optimization over a fixed parameter space.
Applicable to goodness-of-fit testing and invariant transport costs.
Provides bounds and a delta method for OT value fluctuations.
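To make the notion of an empirical OT value under an estimated cost concrete, here is a minimal sketch. It is a toy illustration and not the paper's procedure. It assumes two equal-size samples on the real line and a hypothetical location-invariant cost c_theta(x, y) = (x - y - theta)^2, where theta is replaced by the difference of sample means. The function name and the choice of estimator are assumptions made for this example only.

```python
import random
import statistics


def empirical_ot_estimated_cost(xs, ys):
    """Plug-in OT value for the toy cost c_theta(x, y) = (x - y - theta)^2.

    Hypothetical illustration: theta is estimated from the same samples
    that define the empirical measures.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # Estimate the unknown cost parameter from the data (assumed estimator).
    theta_hat = statistics.fmean(xs) - statistics.fmean(ys)
    # The cost is a convex function of x - y, so on the real line the
    # monotone (sorted) coupling is optimal for equal-size uniform samples.
    pairs = zip(sorted(xs), sorted(ys))
    return sum((x - y - theta_hat) ** 2 for x, y in pairs) / len(xs)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 500
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys_shift = [rng.gauss(3.0, 1.0) for _ in range(n)]  # same shape, shifted
    ys_scale = [rng.gauss(3.0, 2.0) for _ in range(n)]  # different shape
    print("shifted copy:   ", empirical_ot_estimated_cost(xs, ys_shift))
    print("different scale:", empirical_ot_estimated_cost(xs, ys_scale))
```

Because the estimated shift absorbs any pure translation, the value should be close to zero when the two samples differ only in location and clearly positive when their shapes differ. The statistic therefore depends on the data twice, through the empirical measures and through the estimated cost, which is the kind of fluctuation the distributional limits describe.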
Abstract
Optimal transport (OT) based data analysis is often faced with the issue that the underlying cost function is (partially) unknown. This paper is concerned with the derivation of distributional limits for the empirical OT value when the cost function and the measures are estimated from data. For statistical inference purposes, but also from the viewpoint of a stability analysis, understanding the fluctuation of such quantities is paramount. Our results find direct application in the problem of goodness-of-fit testing for group families, in machine learning applications where invariant transport costs arise, in the problem of estimating the distance between mixtures of distributions, and for the analysis of empirical sliced OT quantities. The established distributional limits assume either weak convergence of the cost process in uniform norm or that the cost is determined by an…
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Machine Learning and Algorithms · Statistical Methods and Inference
