Energy-guided Entropic Neural Optimal Transport
Petr Mokrov, Alexander Korotin, Alexander Kolesov, Nikita Gushchin, Evgeny Burnaev

TL;DR
This paper introduces a novel method combining energy-based models with entropy-regularized optimal transport, providing theoretical guarantees and demonstrating scalability to high-resolution image translation tasks.
Contribution
It bridges EBMs and neural OT, offering a new approach with theoretical bounds and practical validation on complex image translation tasks.
Findings
Proves generalization bounds for the proposed method.
Validates applicability on toy and image domains.
Scales to high-resolution unpaired image translation.
Abstract
Energy-based models (EBMs) have been known in the Machine Learning community for decades. Since the seminal works devoted to EBMs dating back to the noughties, there have been a lot of efficient methods which solve the generative modelling problem by means of energy potentials (unnormalized likelihood functions). In contrast, the realm of Optimal Transport (OT) and, in particular, neural OT solvers is much less explored and limited to a few recent works (excluding WGAN-based approaches which utilize OT as a loss function and do not model OT maps themselves). In our work, we bridge the gap between EBMs and Entropy-regularized OT. We present a novel methodology which allows utilizing the recent developments and technical improvements of the former in order to enrich the latter. From the theoretical perspective, we prove generalization bounds for our technique. In practice, we validate its…
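As a rough illustration of what an energy potential (unnormalized likelihood) looks like in practice, the sketch below draws samples from a toy energy using unadjusted Langevin dynamics, a common MCMC scheme for EBMs. The quadratic energy, the step size, and the names `energy_grad` and `langevin_sample` are assumptions made for this example only; they are not taken from the paper and do not reproduce its training procedure.

```python
import numpy as np


def energy_grad(x):
    # Gradient of the toy energy E(x) = 0.5 * ||x||^2 (an assumed example).
    # The density exp(-E(x)) is then a standard Gaussian up to normalization.
    return x


def langevin_sample(grad_fn, x0, step=0.01, n_steps=1000, seed=0):
    # Unadjusted Langevin dynamics:
    #   x <- x - (step / 2) * grad E(x) + sqrt(step) * standard normal noise
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * step * grad_fn(x) + np.sqrt(step) * noise
    return x


# Run 5000 independent 2-D chains, all started far from the mode at (3, 3).
samples = langevin_sample(energy_grad, x0=np.full((5000, 2), 3.0))
print("mean:", samples.mean(axis=0))
print("variance:", samples.var(axis=0))
```

For this toy energy the chains should settle near zero mean and unit variance per coordinate. With a learned neural energy the gradient would come from automatic differentiation, and the step size and number of steps typically need tuning, which is the difficulty the reviewers raise about MCMC.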
Peer Reviews
Decision: ICLR 2024 poster
The paper is overall well written and presented, and the ideas are original to the knowledge of the reviewer. The discussion of all results seems ample and extensive. Some strengths: 1. The paper points out that the semi-dual form of EOT works well for approximation of the optimal coupling. The reviewer finds it interesting, as in the semi-dual form, essentially only one of the two equations of the Schrödinger system is satisfied, thus the other marginal usually lacks control. However, as shown in t
Some weaknesses: 1. One major concern is the bound in Theorem 4, where a classical error bound illustrating the balance between approximation and estimation is provided, using Rademacher complexity and the optimality gap. However, it is unclear what overall rate is expected from this bound, as the choice of parametric class remains heuristic. This can be important, as generative tasks usually operate in high dimensions, and the dependence on dimensionality seems crucial to justify the app
The arguments in the paper are clear and straightforward. The paper is well structured with the contributions highlighted clearly. The background is well-presented and I don't see typos. Figure 2 is well-made and shows the efficacy of their method.
The biggest weakness of their proposed method is that it uses energy-based training, which involves MCMC. I am unclear if this is ideal, as MCMC can be tricky. It would be interesting to see if the unpaired image-to-image task can be done with other OT methods to better see how useful this particular formulation and method is.
This research uncovers the connection between energy-based models and entropy-regularized optimal transport, opening up new applications for EBMs, including unpaired data-to-data translation.
My primary concern centres around the scalability of the proposed approach. The training process hinges on simulating MCMC, which poses significant challenges when dealing with high-dimensional datasets. While promising results have been demonstrated in experiments on high-dimensional unpaired image-to-image translation, it is worth noting that this approach couples with a pretrained GAN model and conducts training in latent spaces. I hold reservations about its direct applicability to image spa
Taxonomy
Topics: Machine Learning in Materials Science · Neural Networks and Applications · Advanced Neural Network Applications
Methods: Dense Connections · Feedforward Network · R1 Regularization · Adaptive Instance Normalization · StyleGAN · Convolution · Wasserstein GAN
