Multivariate rank via entropic optimal transport: sample efficiency and generative modeling
Shoaib Bin Masud, Matthew Werenski, James M. Murphy, Shuchin Aeron

TL;DR
This paper introduces multivariate rank statistics based on entropy-regularized optimal transport. The statistics are computationally efficient, sample-efficient, and differentiable, and they are useful for high-dimensional goodness-of-fit testing and generative modeling.
Contribution
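It proposes two statistics, the soft rank energy (sRE) and the soft rank MMD, built on entropic optimal transport. They address three practical limitations of the earlier rank energy (RE) and rank MMD (RMMD): high computational cost, high statistical sample complexity, and lack of differentiability with respect to the data.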
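For orientation, the standard entropic optimal transport problem and the conditional-mean map it induces can be written as below. The notation (source measure \mu, reference measure \nu, regularization strength \varepsilon, plan \pi_\varepsilon, map R_\varepsilon) is assumed here for illustration and is not taken from this page; taking \nu to be uniform on [0,1]^d is likewise an assumption about the reference measure.

```latex
\pi_\varepsilon
  = \operatorname*{arg\,min}_{\pi \in \Pi(\mu,\nu)}
    \int \tfrac{1}{2}\,\lVert x - y \rVert^2 \, d\pi(x,y)
    + \varepsilon \, \mathrm{KL}\!\left(\pi \,\Vert\, \mu \otimes \nu\right),
\qquad
R_\varepsilon(x)
  = \mathbb{E}_{(X,Y)\sim\pi_\varepsilon}\!\left[\, Y \mid X = x \,\right].
```

Under this reading, the soft rank of a point x is the average reference location that the regularized plan assigns to x, and as \varepsilon shrinks the map approaches the unregularized transport (rank) map.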
Findings
Non-asymptotic convergence rates of order n^{-1/2} for entropic transport maps.
Fast convergence of sample statistics for high-dimensional goodness-of-fit testing.
Demonstrated utility in image generation and feature selection tasks.
Abstract
The framework of optimal transport has been leveraged to extend the notion of rank to the multivariate setting while preserving desirable properties of the resulting goodness-of-fit (GoF) statistics. In particular, the rank energy (RE) and rank maximum mean discrepancy (RMMD) are distribution-free under the null, exhibit high power in statistical testing, and are robust to outliers. In this paper, we point to and alleviate some of the practical shortcomings of these proposed GoF statistics, namely their high computational cost, high statistical sample complexity, and lack of differentiability with respect to the data. We show that all these practically important issues are addressed by considering entropy-regularized optimal transport maps in place of the rank map, which we refer to as the soft rank. We consequently propose two new statistics, the soft rank energy (sRE) and soft rank…
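The idea of replacing the rank map with an entropy-regularized transport map can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`soft_rank`, `energy_distance`, `soft_rank_energy`), the use of random uniform reference points, the pooling of both samples before computing soft ranks, and the values of `eps` and `n_iter` are all hypothetical choices made for the example.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = a.max(axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.exp(a - m).sum(axis=axis))


def soft_rank(x, ref, eps=0.1, n_iter=500):
    """Barycentric projection of an entropic OT plan from x onto ref.

    x:   (n, d) samples; ref: (m, d) reference points (assumed in [0, 1]^d).
    Returns an (n, d) array whose rows are convex combinations of ref.
    """
    n, m = x.shape[0], ref.shape[0]
    # Half squared Euclidean cost between every sample and reference point.
    cost = 0.5 * ((x[:, None, :] - ref[None, :, :]) ** 2).sum(axis=-1)
    log_a = np.full(n, -np.log(n))  # uniform weights on the samples
    log_b = np.full(m, -np.log(m))  # uniform weights on the reference points
    f, g = np.zeros(n), np.zeros(m)
    for _ in range(n_iter):
        # Log-domain Sinkhorn updates of the dual potentials.
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)
    # Plan in log form: pi_ij = a_i * b_j * exp((f_i + g_j - C_ij) / eps).
    log_pi = (f[:, None] + g[None, :] - cost) / eps + log_a[:, None] + log_b[None, :]
    # Row-normalize to get the conditional weights of ref given each sample.
    weights = np.exp(log_pi - _logsumexp(log_pi, axis=1)[:, None])
    return weights @ ref


def energy_distance(u, v):
    # Plug-in (V-statistic) energy distance between two point clouds.
    def mean_dist(a, b):
        return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1).mean()

    return 2.0 * mean_dist(u, v) - mean_dist(u, u) - mean_dist(v, v)


def soft_rank_energy(x, y, ref, eps=0.1, n_iter=500):
    # Assumption: soft ranks are computed once on the pooled sample,
    # then the energy distance compares the two groups of soft ranks.
    pooled = np.vstack([x, y])
    ranks = soft_rank(pooled, ref, eps=eps, n_iter=n_iter)
    return energy_distance(ranks[: len(x)], ranks[len(x):])


# Example usage on synthetic data (all values are illustrative).
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 2))
y_same = rng.normal(size=(100, 2))
y_shifted = rng.normal(loc=2.0, size=(100, 2))
ref = rng.uniform(size=(200, 2))
print(soft_rank_energy(x, y_same, ref), soft_rank_energy(x, y_shifted, ref))
```

Every step in this sketch is a composition of smooth operations (exponentials, logarithms, matrix products), which is why a statistic built this way can in principle be differentiated with respect to the data by an automatic differentiation framework. The cost of each Sinkhorn iteration here is proportional to the number of sample-reference pairs.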
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning
