Optimal Transport for Machine Learners

Gabriel Peyré

arXiv:2505.06589 · stat.ML · May 13, 2025

TL;DR

This paper provides a comprehensive overview of Optimal Transport theory, its mathematical foundations, numerical methods, and applications in machine learning, especially for generative models and neural network training.

Contribution

It offers a detailed synthesis of OT's mathematical principles, numerical algorithms, and diverse applications in modern machine learning.

Findings

01. Explains fundamental OT formulations and the Bures metric.

02. Introduces numerical methods like entropic regularization.

03. Highlights applications in neural network training and generative models.
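
To make the entropic regularization mentioned in the findings concrete, here is a minimal sketch of Sinkhorn iterations between two discrete histograms. It is an illustration under stated assumptions, not code from the notes: the function names `sinkhorn` and `transport_cost`, the regularization strength `eps`, the fixed iteration count, and the toy data are all hypothetical choices.

```python
import math


def sinkhorn(a, b, cost, eps=0.1, n_iters=500):
    """Approximate an entropy-regularized transport plan between a and b.

    a, b    : lists of non-negative weights, each summing to 1
    cost    : cost[i][j] is the cost of moving mass from bin i to bin j
    eps     : regularization strength (assumed value, chosen for illustration)
    n_iters : fixed number of iterations (assumed; no convergence check)
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-cost / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows to match a, then columns to match b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # The plan has the form diag(u) K diag(v).
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def transport_cost(plan, cost):
    """Total cost sum_ij plan[i][j] * cost[i][j]."""
    return sum(p * c for p_row, c_row in zip(plan, cost)
               for p, c in zip(p_row, c_row))


# Toy example: two bins at positions 0 and 1 with squared-distance cost.
a = [0.5, 0.5]
b = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(a, b, cost)
```

The rows of `plan` should sum approximately to `a` and its columns to `b`. For small `eps` the kernel entries can underflow, so a practical implementation would typically work in the log domain; that refinement is omitted here for brevity.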

Abstract

Optimal Transport is a foundational mathematical theory that connects optimization, partial differential equations, and probability. It offers a powerful framework for comparing probability distributions and has recently become an important tool in machine learning, especially for designing and evaluating generative models. These course notes cover the fundamental mathematical aspects of OT, including the Monge and Kantorovich formulations, Brenier's theorem, the dual and dynamic formulations, the Bures metric on Gaussian distributions, and gradient flows. They also introduce numerical methods such as linear programming, semi-discrete solvers, and entropic regularization. Applications in machine learning include training neural networks via gradient flows, token dynamics in transformers, and the structure of GANs and diffusion models. These notes focus primarily on…

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Statistical Mechanics and Entropy

Methods: Diffusion · Focus