A Statistical Learning Perspective on Semi-dual Adversarial Neural Optimal Transport Solvers

Roman Tarasov; Petr Mokrov; Milena Gazdieva; Evgeny Burnaev; Alexander Korotin

arXiv:2502.01310·cs.LG·February 25, 2026

Open Access · 3 Reviews

TL;DR

This paper provides a theoretical analysis of adversarial neural optimal transport methods, establishing generalization error bounds for semi-dual quadratic OT solvers using neural networks, and discusses potential extensions to general OT problems.

Contribution

It introduces the first statistical learning framework for analyzing adversarial neural OT solvers, deriving bounds based on neural network properties, and suggests future work on general OT cases.

Findings

01

Derived upper bounds on the generalization error of neural semi-dual quadratic OT solvers.

02

Bounds depend on standard properties of neural network function classes.

03

Experimental results support the theoretical findings.

Abstract

Neural network-based optimal transport (OT) is a recent and fruitful direction in the generative modeling community. It finds applications in various fields such as domain translation, image super-resolution, computational biology and others. Among the existing OT approaches, adversarial minimax solvers based on semi-dual formulations of OT problems are of considerable interest. While promising, these methods lack theoretical investigation from a statistical learning perspective. Our work fills this gap by establishing upper bounds on the generalization error of an approximate OT map recovered by the minimax quadratic OT solver. Importantly, the bounds we derive depend solely on some standard statistical and mathematical properties of the considered functional classes (neural nets). While our analysis focuses on the quadratic OT, we believe that similar bounds could be derived for…
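
For orientation, the semi-dual minimax objective that solvers of this kind optimize can be written out explicitly. The block below is a sketch in the standard notation for quadratic-cost neural OT, not necessarily the paper's own. The symbols $p$ and $q$ (source and target distributions), $f$ (potential), $T$ (transport map), the sample sizes $N$ and $M$, and the $L^2(p)$ error measure in the last line are all assumptions introduced here for illustration.

```latex
% Quadratic-cost OT in Monge form between p and q
% (assumes p is regular enough for an optimal map T^* to exist):
\mathrm{OT}(p, q) = \min_{T \,:\, T_{\#} p = q} \int \tfrac{1}{2}\,\|x - T(x)\|^2 \, dp(x)

% Semi-dual minimax reformulation: the potential f replaces the
% push-forward constraint, and T plays the role of the inner minimizer.
\mathrm{OT}(p, q) = \max_{f} \min_{T} \mathcal{L}(f, T), \qquad
\mathcal{L}(f, T) = \int \Big[ \tfrac{1}{2}\,\|x - T(x)\|^2 - f\big(T(x)\big) \Big] dp(x)
                  + \int f(y)\, dq(y)

% Empirical objective from samples X_1,\dots,X_N \sim p and Y_1,\dots,Y_M \sim q,
% with f and T restricted to neural network classes:
\widehat{\mathcal{L}}(f, T) = \frac{1}{N} \sum_{n=1}^{N}
    \Big[ \tfrac{1}{2}\,\|X_n - T(X_n)\|^2 - f\big(T(X_n)\big) \Big]
  + \frac{1}{M} \sum_{m=1}^{M} f(Y_m)

% One natural way to measure the error of the recovered map \widehat{T}
% (an assumed choice for this sketch):
\mathbb{E}\, \big\| \widehat{T} - T^{*} \big\|^{2}_{L^{2}(p)}
```

In this picture, restricting $f$ and $T$ to neural network classes contributes an approximation error, and replacing the integrals in $\mathcal{L}$ with the sample averages in $\widehat{\mathcal{L}}$ contributes an estimation error.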

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 4

Strengths

Theoretical framing is clear, separating approximation, optimization, and estimation errors. The proofs use standard tools from empirical process theory and stability.

Weaknesses

The β-strong convexity condition is assumed rather than derived from the model or data-generating process. Please either prove it under verifiable conditions or provide a constructive check that practitioners can apply to certify it.

Reviewer 02 · Rating 6 · Confidence 4

Strengths

The paper is well written, very clear, and accessible to readers familiar with optimal transport, learning theory and statistical consistency. Overall, the strategy for tackling the problem is well explained. The related work is well covered, especially on semi-dual formulations, which are becoming popular, making the contribution timely. The methodology is standard, but the result is new. Only a subset of results actually depend on the approximation property of certain neural network

Weaknesses

### Missing related work

The recent work of Nietert and Goldfeld is relevant and could be added to the related work.

Nietert, S. and Goldfeld, Z., *Estimation of Stochastic Optimal Transport Maps*. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.

### Somewhat unconvincing experiments

My enthusiasm for the paper is somewhat tempered by the numerical results. **In Figure 3**, the logarithmic scale hides many phenomena. For example, I'd be curious to see w

Reviewer 03 · Rating 6 · Confidence 3

Strengths

The paper makes an original theoretical contribution by deriving generalization error bounds for quadratic optimal transport (OT) solvers parameterized by neural networks, particularly within the class of input-convex models. This work provides valuable insights into the learnability and error behavior of neural OT mappings and offers practical guidance for their use in learning-based transport problems. The theoretical framework is rigorous, decomposing the overall generalization error into est

Weaknesses

A key limitation of the paper is its exclusive reliance on low-dimensional synthetic datasets for empirical validation. While the theoretical analysis is sound, the absence of experiments on higher-dimensional or real-world datasets limits the assessment of the framework’s practical relevance and robustness. Extending the experiments to more complex domains would provide stronger empirical support for the theoretical claims and demonstrate the scalability of the proposed bounds. Additionally, t

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and ELM · Neural Networks and Applications