Low-rank Optimal Transport: Approximation, Statistics and Debiasing
Meyer Scetbon, Marco Cuturi

TL;DR
This paper studies low-rank optimal transport (LOT) along three axes, namely approximation quality, statistical behavior, and debiasing, with the aim of establishing LOT as a competitive and scalable alternative to entropic regularization for large-scale optimal transport problems.
Contribution
The paper analyzes the theoretical foundations, statistical properties, and practical benefits of low-rank optimal transport, and positions it as a viable, scalable alternative to entropic regularization.
Findings
LOT yields linear-time algorithms in certain cases, which makes it applicable to large datasets.
The theoretical analysis supports LOT's statistical efficiency.
LOT can be debiased, and its hyperparameters are simpler to tune.
Abstract
The matching principles behind optimal transport (OT) play an increasingly important role in machine learning, a trend which can be observed when OT is used to disambiguate datasets in applications (e.g. single-cell genomics) or used to improve more complex methods (e.g. balanced attention in transformers or self-supervised learning). To scale to more challenging problems, there is a growing consensus that OT requires solvers that can operate on millions, not thousands, of points. The low-rank optimal transport (LOT) approach advocated in Scetbon et al. (2021) holds several promises in that regard, and was shown to complement more established entropic regularization approaches, being able to insert itself in more complex pipelines, such as quadratic OT. LOT restricts the search for low-cost couplings to those that have a low nonnegative rank, yielding linear time algorithms in cases…
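To make the "low nonnegative rank" restriction concrete, the sketch below builds a coupling of the form P = Q diag(1/g) Rᵀ and evaluates its transport cost without ever forming the n × m cost or coupling matrix. It is an illustration under stated assumptions, not the authors' solver: the cost is taken to be the squared Euclidean distance (which factors exactly as A Bᵀ), the marginals are uniform, and the factors Q and R come from a hand-built, non-optimized block assignment. The helper names `sqeuclidean_factors`, `block_factor`, and `low_rank_cost` are hypothetical.

```python
import numpy as np


def sqeuclidean_factors(x, y):
    """Return A (n, d+2) and B (m, d+2) such that A @ B.T holds the pairwise
    squared Euclidean distances, using |x-y|^2 = |x|^2 + |y|^2 - 2 x.y."""
    A = np.hstack([np.sum(x**2, axis=1, keepdims=True),
                   np.ones((len(x), 1)),
                   -2.0 * x])
    B = np.hstack([np.ones((len(y), 1)),
                   np.sum(y**2, axis=1, keepdims=True),
                   y])
    return A, B


def block_factor(n, r):
    """Hypothetical feasible factor (assumes n % r == 0): point i sends its
    whole mass 1/n to slot i % r, so rows sum to 1/n and columns to 1/r."""
    Q = np.zeros((n, r))
    Q[np.arange(n), np.arange(n) % r] = 1.0 / n
    return Q


def low_rank_cost(A, B, Q, R, g):
    """Compute <C, P> for C = A B^T and P = Q diag(1/g) R^T.
    Only (d+2, r) matrices are formed, so the work is O((n + m) * d * r)."""
    return float(np.sum((A.T @ Q / g) * (B.T @ R)))


rng = np.random.default_rng(0)
n, m, d, r = 12, 8, 3, 4
x, y = rng.normal(size=(n, d)), rng.normal(size=(m, d))

A, B = sqeuclidean_factors(x, y)
Q, R = block_factor(n, r), block_factor(m, r)
g = np.full(r, 1.0 / r)  # common inner marginal of Q and R

# Dense reference quantities, affordable only because n and m are tiny here.
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
P = Q @ np.diag(1.0 / g) @ R.T
```

The dense matrices C and P appear only as a sanity check; with millions of points one would keep A, B, Q, R, and g and never materialize them. An actual LOT solver would optimize Q, R, and g to lower the cost, which this sketch does not attempt.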
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Gaussian Processes and Bayesian Inference · Distributed Sensor Networks and Detection Algorithms
