Compressed online Sinkhorn

Fengpei Wang; Clarice Poon; Tony Shardlow

arXiv:2310.05019 · cs.LG · October 10, 2023


TL;DR

This paper improves the convergence analysis of the online Sinkhorn algorithm for optimal transport and introduces a compressed version that combines measure compression with online Sinkhorn, demonstrating both theoretical and practical benefits.

Contribution

The paper derives a faster convergence rate for the online Sinkhorn algorithm (under certain parameter choices) and proposes a compressed variant that improves efficiency through measure compression.

Findings

01. Faster convergence rate under certain parameters.

02. Numerical results verify the sharpness of the theoretical analysis.

03. Practical numerical gains with the compressed online Sinkhorn.

Abstract

The use of optimal transport (OT) distances, and in particular entropic-regularised OT distances, is an increasingly popular evaluation metric in many areas of machine learning and data science. Their use has largely been driven by the availability of efficient algorithms such as the Sinkhorn algorithm. One of the drawbacks of the Sinkhorn algorithm for large-scale data processing is that it is a two-phase method, where one first draws a large stream of data from the probability distributions, before applying the Sinkhorn algorithm to the discrete probability measures. More recently, there have been several works developing stochastic versions of Sinkhorn that directly handle continuous streams of data. In this work, we revisit the recently introduced online Sinkhorn algorithm of [Mensch and Peyré, 2020]. Our contributions are twofold: We improve the convergence analysis for the…
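To make the "two-phase" procedure described in the abstract concrete, here is a minimal sketch of the classical (batch) Sinkhorn algorithm, not the online or compressed variant studied in the paper. It first draws a fixed sample from two distributions and then runs log-domain Sinkhorn iterations on the resulting discrete measures. The function name `sinkhorn`, the Gaussian test distributions, the squared-distance cost, and the values of `eps` and `n_iters` are all illustrative assumptions.

```python
import numpy as np

def _logsumexp(M, axis):
    # Numerically stable log-sum-exp along one axis.
    m = M.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(M - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def sinkhorn(a, b, C, eps=1.0, n_iters=500):
    """Batch Sinkhorn in the log domain (illustrative sketch).

    a, b : weights of the two discrete measures (positive, each summing to 1)
    C    : cost matrix, C[i, j] = cost between atom i of a and atom j of b
    eps  : entropic regularisation strength
    Returns dual potentials f, g and the transport plan P.
    """
    log_a, log_b = np.log(a), np.log(b)
    f = np.zeros_like(a)
    g = np.zeros_like(b)
    for _ in range(n_iters):
        # Alternating soft-min updates of the two dual potentials.
        f = -eps * _logsumexp(log_b[None, :] + (g[None, :] - C) / eps, axis=1)
        g = -eps * _logsumexp(log_a[:, None] + (f[:, None] - C) / eps, axis=0)
    P = np.exp(log_a[:, None] + log_b[None, :] + (f[:, None] + g[None, :] - C) / eps)
    return f, g, P

# Phase 1: draw a fixed sample from each distribution (assumed Gaussians here).
rng = np.random.default_rng(0)
n = 50
x = rng.normal(0.0, 1.0, size=n)
y = rng.normal(1.0, 1.0, size=n)
a = np.full(n, 1.0 / n)
b = np.full(n, 1.0 / n)
C = (x[:, None] - y[None, :]) ** 2  # squared-distance cost (an assumption)

# Phase 2: run Sinkhorn on the two empirical measures.
f, g, P = sinkhorn(a, b, C, eps=1.0, n_iters=500)
```

Because the sample is fixed before the iterations start, the quality of the result is bounded by the sample size chosen up front; stochastic and online variants aim to remove that restriction by consuming new samples as the iterations proceed.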

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 5 (marginally below the acceptance threshold) · Confidence 2

Strengths

The problem of estimating dual potentials for OT on continuous distributions is a difficult one. For this reason, despite being incremental in nature, I think the result may be important.

Weaknesses

On the negative side, I had a hard time appreciating the five different assumptions made in the paper. I couldn’t quite tell whether they were necessary or they were made as a matter of convenience. Also, the paper is written in a way that makes it only accessible to people who are familiar with previous work (and not for folks who may have a good understanding of the optimal transport problem but lack familiarity with online Sinkhorn). I’m still not able to fully appreciate the result and under

Reviewer 02 · Rating 6 (marginally above the acceptance threshold) · Confidence 3

Strengths

The idea of compressing measures seems very interesting from an algorithmic point of view. The analysis is quite simple. (See below, however.) The previous bound for the first algorithm does seem to have been incorrect.

Weaknesses

The bounds depend on a constant $\kappa$ that can be quite small. The proofs are fairly straightforward. Proof writing leaves a bit to be desired and I had trouble following some arguments. 1) The constant $\kappa$ and the fact that it is at most $1$ are explained for the first time 2) I believe the Lipschitz constant in Lemma 4 (with the notation employed) should be $L$, or maybe the formula for $T_\beta$ is missing a $1/\epsilon$ factor in the exponent. 3) The last equality in the first m

Reviewer 03 · Rating 6 (marginally above the acceptance threshold) · Confidence 3

Strengths

The authors provide two settings for which their compression can be implemented: Gaussian quadrature and Fourier moments compression. They fix a minor error in a proof of a previous paper on online Sinkhorn. Numerical evidence is presented. The paper is well written.

Weaknesses

The experiments are done in settings of very low dimensionality. For the one of greater dimension (d=5), the uncompressed method starts to look considerably better.
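As background for the Gaussian quadrature compression mentioned in Reviewer 03's strengths, the sketch below shows the classical one-dimensional construction: a discrete measure with many atoms is replaced by an m-atom measure that matches its polynomial moments up to degree 2m-1. It uses the standard Stieltjes recurrence and the Golub-Welsch eigenvalue step. This is a generic textbook illustration and is not claimed to be the compression scheme used in the paper; the function name `gauss_compress` and the test data are hypothetical.

```python
import numpy as np

def gauss_compress(x, w, m):
    """Compress a 1D discrete measure (atoms x, weights w) to m atoms.

    The result matches the moments of the input up to degree 2m - 1.
    Illustrative sketch only; assumes the input has at least m distinct atoms.
    """
    p_prev = np.zeros_like(x)   # p_{k-1} evaluated at the atoms
    p = np.ones_like(x)         # p_k evaluated at the atoms, starting with p_0 = 1
    norm_prev = 1.0
    alphas, betas = [], []
    for k in range(m):
        # Stieltjes recurrence: coefficients of monic orthogonal polynomials.
        norm = np.sum(w * p * p)
        alpha = np.sum(w * x * p * p) / norm
        beta = norm / norm_prev if k > 0 else 0.0
        alphas.append(alpha)
        if k > 0:
            betas.append(beta)
        p_prev, p = p, (x - alpha) * p - beta * p_prev
        norm_prev = norm
    # Golub-Welsch: eigen-decomposition of the symmetric Jacobi matrix.
    off = np.sqrt(betas)
    J = np.diag(alphas) + np.diag(off, 1) + np.diag(off, -1)
    nodes, vecs = np.linalg.eigh(J)
    weights = np.sum(w) * vecs[0, :] ** 2
    return nodes, weights

# Hypothetical example: compress 200 uniformly weighted samples to 4 atoms.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
w = np.full(200, 1.0 / 200)
nodes, weights = gauss_compress(x, w, m=4)
```

The compressed measure keeps total mass and the first 2m moments (degrees 0 through 2m-1) of the original while storing only m atoms. The recurrence becomes numerically delicate for large m, and, consistent with the reviewer's remark about dimensionality, quadrature-style compression is usually hardest to keep efficient as the dimension grows.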


Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Stochastic Gradient Optimization Techniques