Multivariate rank via entropic optimal transport: sample efficiency and generative modeling

Shoaib Bin Masud; Matthew Werenski; James M. Murphy; Shuchin Aeron

arXiv:2111.00043 · stat.ML · November 29, 2022

TL;DR

This paper introduces entropy-regularized optimal transport-based multivariate rank statistics that are computationally efficient, sample-efficient, differentiable, and useful for high-dimensional goodness-of-fit testing and generative modeling.

Contribution

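It proposes two statistics built on entropic optimal transport, the soft rank energy (sRE) and the soft rank MMD, to address three practical limitations of the earlier rank energy and rank MMD: high computational cost, high sample complexity, and lack of differentiability with respect to the data.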

Findings

01. Non-asymptotic convergence rates of order n^{-1/2} for entropic transport maps.

02. Fast convergence of sample statistics for high-dimensional goodness-of-fit testing.

03. Demonstrated utility in image generation and feature selection tasks.

Abstract

The framework of optimal transport has been leveraged to extend the notion of rank to the multivariate setting while preserving desirable properties of the resulting goodness-of-fit (GoF) statistics. In particular, the rank energy (RE) and rank maximum mean discrepancy (RMMD) are distribution-free under the null, exhibit high power in statistical testing, and are robust to outliers. In this paper, we point to and alleviate some of the practical shortcomings of these proposed GoF statistics, namely their high computational cost, high statistical sample complexity, and lack of differentiability with respect to the data. We show that all these practically important issues are addressed by considering entropy-regularized optimal transport maps in place of the rank map, which we refer to as the soft rank. We consequently propose two new statistics, the soft rank energy (sRE) and soft rank…
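The soft rank idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (their official PyTorch code is in the repository listed under Code & Models). It assumes a uniform reference sample on the unit cube, a squared-Euclidean cost, a log-domain Sinkhorn solver, and a barycentric projection of the resulting plan as the soft rank map; the function names `sinkhorn_plan`, `soft_rank`, `energy_distance`, and `soft_rank_energy`, and the defaults `eps=0.1` and `n_iter=200`, are hypothetical choices made for illustration.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = a.max(axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.exp(a - m).sum(axis=axis))


def sinkhorn_plan(x, ref, eps=0.1, n_iter=200):
    """Entropic OT plan between uniform empirical measures on x and ref."""
    n, m = len(x), len(ref)
    # Assumed cost: half the squared Euclidean distance.
    cost = 0.5 * ((x[:, None, :] - ref[None, :, :]) ** 2).sum(axis=-1)
    log_a = np.full(n, -np.log(n))
    log_b = np.full(m, -np.log(m))
    f, g = np.zeros(n), np.zeros(m)  # dual potentials
    for _ in range(n_iter):
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)
    log_plan = (f[:, None] + g[None, :] - cost) / eps + log_a[:, None] + log_b[None, :]
    return np.exp(log_plan)


def soft_rank(x, ref, eps=0.1, n_iter=200):
    """Barycentric projection of the entropic plan: one soft rank per row of x."""
    plan = sinkhorn_plan(x, ref, eps, n_iter)
    # Each soft rank is a weighted average of reference points.
    return (plan @ ref) / plan.sum(axis=1, keepdims=True)


def energy_distance(p, q):
    """Plug-in (V-statistic) energy distance between two point clouds."""
    def mean_dist(a, b):
        return np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)).mean()
    return 2.0 * mean_dist(p, q) - mean_dist(p, p) - mean_dist(q, q)


def soft_rank_energy(x, y, eps=0.1, n_iter=200, seed=0):
    """Two-sample statistic: energy distance between soft ranks of x and y."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    # Assumed reference: i.i.d. uniform draws on [0, 1]^d, one per pooled point.
    ref = rng.uniform(size=pooled.shape)
    ranks = soft_rank(pooled, ref, eps, n_iter)
    return energy_distance(ranks[: len(x)], ranks[len(x):])
```

Calling `soft_rank_energy(x, y)` on two arrays of shape `(n, d)` and `(m, d)` returns a scalar that should be close to zero when both samples come from the same distribution. Every step is a smooth function of the data, which is the property the abstract refers to as differentiability; the regularization strength `eps` trades off smoothness against closeness to the unregularized rank map.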

Code & Models

Repositories

shoaibbinmasud/soft-rank-energy-and-applications (PyTorch, official)

Taxonomy

Topics: Markov Chains and Monte Carlo Methods · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning