Transport Clustering: Solving Low-Rank Optimal Transport via Clustering
Henri Schmidt, Peter Halmos, Ben Raphael

TL;DR
This paper introduces transport clustering, an efficient algorithm for low-rank optimal transport that reduces the problem to clustering, providing approximation guarantees and outperforming existing methods on benchmarks.
Contribution
It presents a polynomial-time approximation algorithm for low-rank OT by reducing it to a clustering problem, with theoretical guarantees and empirical improvements.
Findings
Outperforms existing low-rank OT solvers on benchmarks
Provides polynomial-time approximation algorithms with provable guarantees
Achieves better statistical stability and robustness in OT estimation
Abstract
Optimal transport (OT) finds a least cost transport plan between two probability distributions using a cost matrix defined on pairs of points. Unlike standard OT, which infers unstructured pointwise mappings, low-rank optimal transport explicitly constrains the rank of the transport plan to infer latent structure. This improves statistical stability and robustness, yields sharper parametric rates for estimating Wasserstein distances adaptive to the intrinsic rank, and generalizes $K$-means to co-clustering. These advantages, however, come at the cost of a non-convex and NP-hard optimization problem. We introduce transport clustering, an algorithm to compute a low-rank OT plan that reduces low-rank OT to a clustering problem on correspondences obtained from a full-rank step. We prove that this reduction yields polynomial-time, constant-factor…
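The two-step idea in the abstract (a full-rank step that produces correspondences, followed by clustering) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes equally many points in both sets, uniform weights, and a squared Euclidean cost; it uses SciPy's Hungarian solver for the full-rank step and plain Lloyd iterations on concatenated matched pairs for the clustering step. The function name `transport_clustering_sketch` and the choice to cluster concatenated pairs are hypothetical, chosen only for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def transport_clustering_sketch(X, Y, K, n_iter=50, seed=0):
    """Illustrative only: full-rank matching, then K-means on matched pairs."""
    n = X.shape[0]
    assert Y.shape[0] == n, "sketch assumes equal sizes and uniform weights"

    # Step 1 (full-rank step): squared Euclidean cost and an optimal assignment.
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    _, sigma = linear_sum_assignment(C)  # x_i is matched to y_{sigma[i]}

    # Step 2 (clustering on correspondences): Lloyd iterations.
    # Assumption: each matched pair is represented by concatenating x_i and y_{sigma[i]}.
    Z = np.hstack([X, Y[sigma]]).astype(float)
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(n, size=K, replace=False)]
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = Z[labels == k].mean(axis=0)

    # Step 3: assemble a plan P = Q diag(1/g) R^T of rank at most K.
    Q = np.zeros((n, K))
    R = np.zeros((n, K))
    Q[np.arange(n), labels] = 1.0 / n  # x_i sends its mass to its cluster
    R[sigma, labels] = 1.0 / n         # y_{sigma[i]} shares the cluster of x_i
    g = Q.sum(axis=0)
    keep = g > 0                       # drop empty clusters so 1/g is defined
    P = (Q[:, keep] / g[keep]) @ R[:, keep].T
    return P, C, sigma, labels
```

Calling `transport_clustering_sketch(X, Y, K=3)` on two `(n, d)` arrays returns a coupling `P` whose row and column sums are both `1/n`. Its transport cost `(P * C).sum()` can then be compared with the full-rank cost `C[np.arange(n), sigma].mean()`, which is a lower bound for any feasible plan.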
Peer Reviews
Decision·Submitted to ICLR 2026
The reduction from low-rank OT to a single clustering problem via cost registration is novel in my opinion, conceptually unifying co-clustering with $K$-means while providing constant-factor guarantees, leveraging mature toolboxes such as Lloyd iterations and SDP relaxations, and consistently improving transport costs over LOT, FRLC, and LatentOT in experiments—all of which makes for a genuinely novel and useful perspective on low-rank OT.
TC still requires a full OT solve up front, so the practical savings are unclear without runtime data; the constant-factor guarantees depend on assumptions and a small gap $\gamma$ that are not quantified empirically; registration for non-square or weighted marginals is only sketched (no experiments); implementation details (for example, ensuring positive column sums, initialization fairness, and iteration counts) are lacking; the cost registration may create prohibitive memory demands.
1. The transport clustering method is elegantly designed. 2. The authors provide theoretical proofs for the approximation factors. 3. Experimental results such as Figure 2 illustrate its advantages over existing low-rank OT methods.
1. When the marginal distributions are arbitrary distributions, the authors outline an extension for the method (Line 267-Line 276), but the paper seems to lack theoretical guarantees for such an extension. For example, does problem (9) even have a solution when the marginal distributions are arbitrary? How robust is this extension? 2. The method relies on existing full-rank OT solvers like the Hungarian algorithm or the Sinkhorn algorithm to register the cost matrix, which undermines the comp
The paper is very well written and addresses an important problem which could be used in the development of computational tools for domain registration and alignment. The proposed methods provide a strong foundation for bridging distributional differences between datasets through transport-based formulations. Furthermore, the techniques presented in the paper could be extended to design large-scale, class-conditioned domain registration frameworks, enabling more structured and semantically meani
One major source of confusion for me is the incessant switching between the assignment-form and partition formulations of the low-rank optimal transport problem. In the main body, results are stated in assignment form, whereas the proofs in the appendix are written in the partition formulation, which makes it harder for the reader to follow the proofs.
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods
