Sinkhorn Distances: Lightspeed Computation of Optimal Transportation Distances
Marco Cuturi

TL;DR
This paper introduces a new family of optimal transportation distances that are computationally efficient due to entropic regularization and Sinkhorn's algorithm, enabling faster and improved histogram comparison.
Contribution
The author proposes an entropic regularization of the optimal transportation problem that significantly accelerates computation and improves on classical optimal transportation distances in experiments.
Findings
Computation is several orders of magnitude faster than with classical transportation solvers.
The new distances outperform classical optimal transportation distances on the MNIST benchmark.
The optimum of the regularized problem is itself a distance, so the metric properties of optimal transportation are preserved.
Abstract
Optimal transportation distances are a fundamental family of parameterized distances for histograms. Despite their appealing theoretical properties, excellent performance in retrieval tasks and intuitive formulation, their computation involves the resolution of a linear program whose cost is prohibitive whenever the histograms' dimension exceeds a few hundred. We propose in this work a new family of optimal transportation distances that look at transportation problems from a maximum-entropy perspective. We smooth the classical optimal transportation problem with an entropic regularization term, and show that the resulting optimum is also a distance which can be computed through Sinkhorn-Knopp's matrix scaling algorithm at a speed that is several orders of magnitude faster than that of transportation solvers. We also report improved performance over classical optimal transportation…
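The matrix scaling idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's reference implementation: the function name `sinkhorn_cost`, the parameter name `lam`, its default value, and the fixed iteration count are all assumptions made for illustration. The sketch also assumes both histograms have strictly positive entries that sum to 1, and it uses only the Python standard library.

```python
import math

def sinkhorn_cost(r, c, M, lam=10.0, n_iter=500):
    """Approximate transport cost between histograms r and c.

    r, c   : lists of strictly positive weights, each summing to 1
    M      : cost matrix as a list of rows; M[i][j] is the cost of
             moving mass from bin i of r to bin j of c
    lam    : regularization strength (hypothetical default); larger
             values bring the result closer to the unregularized cost
    n_iter : fixed number of scaling iterations (an assumption; a
             convergence test could be used instead)
    """
    n, m = len(r), len(c)
    # Elementwise kernel K = exp(-lam * M).
    K = [[math.exp(-lam * M[i][j]) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns so that the plan
        # diag(u) K diag(v) matches the marginals r and c.
        u = [r[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [c[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v); the cost is <P, M>.
    P = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    cost = sum(P[i][j] * M[i][j] for i in range(n) for j in range(m))
    return cost, P
```

Each iteration costs only a pair of matrix-vector products, which is where the speed advantage over a general linear programming solver comes from. Very large values of `lam` can make entries of `K` underflow to zero, so a practical implementation would likely need to guard against that.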
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Machine Learning and Algorithms · Multimodal Machine Learning Applications
