Sparsistency for Inverse Optimal Transport

Francisco Andrade; Gabriel Peyré; Clarice Poon

arXiv:2310.05461·math.ST·March 12, 2024·ICLR·1 cites

Open Access 3 Reviews

TL;DR

This paper provides a theoretical analysis of inverse optimal transport, focusing on l1 regularization for sparse ground costs, and establishes connections to graph estimation and classical Lasso methods.

Contribution

It derives a sufficient condition for robust sparsity recovery in inverse optimal transport and explores its relation to Lasso and graphical models.

Findings

01

Derived a generalized Irrepresentability Condition for sparsity recovery.

02

Showed that varying the entropic penalty interpolates between the graphical Lasso and the classical Lasso.

03

Connected inverse optimal transport to graph estimation in machine learning.
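
For readers unfamiliar with irrepresentability, the following is a minimal sketch of the classical Lasso version of the condition, which the first finding generalizes. It is not the paper's generalized condition for iOT. For a design matrix `X`, support `S` and sign vector `s`, the classical condition asks that the largest entry of $|X_{S^c}^\top X_S (X_S^\top X_S)^{-1} s|$ be strictly below 1. The function name and the toy matrices are illustrative assumptions.

```python
import numpy as np

def irrepresentability_margin(X, support, signs):
    """Classical Lasso quantity max_j |X_j^T X_S (X_S^T X_S)^{-1} s| over j outside S.

    Illustrative helper (name is an assumption); the classical condition
    holds when the returned value is strictly less than 1.
    """
    S = np.asarray(support)
    off_support = np.ones(X.shape[1], dtype=bool)
    off_support[S] = False
    XS, XSc = X[:, S], X[:, off_support]
    # Solve (X_S^T X_S) w = s instead of forming the inverse explicitly.
    w = np.linalg.solve(XS.T @ XS, np.asarray(signs, dtype=float))
    return float(np.max(np.abs(XSc.T @ XS @ w)))

# Orthogonal design: off-support columns are uncorrelated with the support.
X_orth = np.eye(4)
margin_orth = irrepresentability_margin(X_orth, [0, 1], [1, 1])

# Correlated design: the third column is the sum of the two support columns.
X_corr = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])
margin_corr = irrepresentability_margin(X_corr, [0, 1], [1, 1])

print(margin_orth, margin_corr)
```

In the orthogonal case the margin is 0, so the condition holds. In the correlated case it is 2, so the condition fails and the Lasso is not guaranteed to recover the support.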

Abstract

Optimal Transport is a useful metric to compare probability distributions and to compute a pairing given a ground cost. Its entropic regularization variant (eOT) is crucial to have fast algorithms and reflect fuzzy/noisy matchings. This work focuses on Inverse Optimal Transport (iOT), the problem of inferring the ground cost from samples drawn from a coupling that solves an eOT problem. It is a relevant problem that can be used to infer unobserved/missing links, and to obtain meaningful information about the structure of the ground cost yielding the pairing. On one side, iOT benefits from convexity, but on the other side, being ill-posed, it requires regularization to handle the sampling noise. This work presents an in-depth theoretical study of the l1 regularization to model for instance Euclidean costs with sparse interactions between features. Specifically, we derive a sufficient…
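
As a rough illustration of the forward problem the abstract describes, the sketch below computes an entropic OT coupling with Sinkhorn iterations and then samples index pairs from it. Those samples are the kind of data iOT would start from. The bilinear cost $c(x, y) = -x^\top A y$ with a sparse matrix `A`, the value of `eps`, and all variable names are assumptions chosen for illustration, not the paper's exact setup.

```python
import numpy as np

def sinkhorn_coupling(cost, a, b, eps=1.0, n_iter=1000):
    """Entropic OT coupling for a cost matrix and marginals a, b (Sinkhorn)."""
    K = np.exp(-cost / eps)           # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)               # match row marginals
        v = b / (K.T @ u)             # match column marginals
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
n, d = 6, 3
X = rng.normal(size=(n, d))
Y = rng.normal(size=(n, d))

# Hypothetical sparse interaction matrix: only two feature pairs interact.
A = np.zeros((d, d))
A[0, 0] = 1.0
A[1, 2] = -0.5
cost = -X @ A @ Y.T                   # assumed bilinear ground cost

a = np.full(n, 1.0 / n)
b = np.full(n, 1.0 / n)
P = sinkhorn_coupling(cost, a, b)

# Draw (i, j) index pairs from the coupling: the observations available to iOT.
flat = rng.choice(n * n, size=1000, p=P.ravel() / P.sum())
pairs = np.column_stack(np.unravel_index(flat, P.shape))
```

Recovering `A` from `pairs` alone is the inverse problem; the sketch covers only the data-generating side.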

Peer Reviews

Decision · ICLR 2024 poster

Reviewer 01 · Rating 6 (marginally above the acceptance threshold) · Confidence 2

Strengths

Inverse optimal transport has recently attracted attention in the community due to its potential impact in ML. Proposing better alternatives to solve this nonlinear inverse problem is thus of interest. The paper provides results on both "full distribution" and finite sample problems. I did not spot any mathematical error, but could not check all of the paper.

Weaknesses

The authors could improve the pedagogy of the paper, which is quite heavy in terms of notation. The theory is quite involved and requires background references to other works, such as those of Carlier or Galichon. The experiment description is a single large block which, in my opinion, does not bring as many insights as it could.

Reviewer 02 · Rating 8 (accept, good paper) · Confidence 2

Strengths

- Inverse optimal transport is an interesting problem and an interesting take on the metric learning problem. This work takes a significant step forward in establishing a theoretical grounding for the regularized iOT problem.
- I really enjoyed reading the paper. The writing is very clear, and the background, method, and results are well presented.
- Showing that graphical LASSO is a special case of inverse OT is very interesting and makes sense.

Weaknesses

- Nothing that I can think of.

Reviewer 03 · Rating 8 (accept, good paper) · Confidence 3

Strengths

Finding the cost function from the empirical process induced by the optimal transport algorithm is an important open question, and this paper addresses it. The connection of the non-degenerate precertificate assumption with the irrepresentability assumption in the Lasso is an interesting insight. Finally, I found the connections to the SVD of the covariance matrix illuminating. The Gaussian example also serves to demonstrate the theory through the lens of an exampl

Weaknesses

Although the sample complexity bound is good, there was no discussion of its tightness. I wonder whether the $1/\sqrt{n}$ rate is also the best lower bound. Given that the empirical density acts as a "plug-in" for the unknown true coupled density, can the statistical guarantees of the plug-in be translated into guarantees for the cost function?

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods · Stochastic Gradient Optimization Techniques