Offline Reinforcement Learning with Wasserstein Regularization via Optimal Transport Maps

Motoki Omura; Yusuke Mukuta; Kazuki Ota; Takayuki Osa; Tatsuya Harada

arXiv:2507.10843 · cs.LG · July 16, 2025

Open Access

TL;DR

This paper introduces an offline reinforcement learning method that regularizes the policy with the Wasserstein distance. The distance is computed from optimal transport maps modeled by input-convex neural networks (ICNNs), which avoids adversarial training and is intended to make policy learning more stable.

Contribution

It proposes Wasserstein-distance regularization for offline RL in which ICNNs model the optimal transport map, so the distance is computed without a discriminator and without adversarial training.

Findings

01

Achieves results comparable to or better than existing methods on the D4RL benchmark.

02

Utilizes a discriminator-free approach for Wasserstein distance computation.

03

Demonstrates robustness to out-of-distribution actions.

Abstract

Offline reinforcement learning (RL) aims to learn an optimal policy from a static dataset, making it particularly valuable in scenarios where data collection is costly, such as robotics. A major challenge in offline RL is distributional shift, where the learned policy deviates from the dataset distribution, potentially leading to unreliable out-of-distribution actions. To mitigate this issue, regularization techniques have been employed. While many existing methods utilize density ratio-based measures, such as the $f$-divergence, for regularization, we propose an approach that utilizes the Wasserstein distance, which is robust to out-of-distribution data and captures the similarity between actions. Our method employs input-convex neural networks (ICNNs) to model optimal transport maps, enabling the computation of the Wasserstein distance in a discriminator-free manner, thereby avoiding…
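To make the ICNN idea in the abstract concrete, here is a minimal NumPy sketch, not the authors' implementation. It relies on a standard fact (Brenier's theorem): for squared Euclidean cost, the optimal transport map is the gradient of a convex function, so a network that is convex in its input can parameterize such a map through its gradient. The class name `TinyICNN`, the single hidden layer, the softplus activation, and the helper `mean_transport_cost` are all assumptions chosen for illustration; the paper's architecture and training objective are not given in this excerpt.

```python
import numpy as np


def softplus(u):
    # Numerically stable log(1 + exp(u)); convex and non-decreasing.
    return np.logaddexp(0.0, u)


def sigmoid(u):
    # Derivative of softplus, written via tanh for numerical stability.
    return 0.5 * (1.0 + np.tanh(0.5 * u))


class TinyICNN:
    """Hypothetical one-hidden-layer input-convex network (illustration only).

    potential(x) = w1 . softplus(W0 x + b0) + a . x
    Convexity in x holds because softplus of an affine map is convex and
    the output weights w1 are constrained to be non-negative.
    """

    def __init__(self, action_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W0 = rng.normal(size=(hidden_dim, action_dim))
        self.b0 = rng.normal(size=hidden_dim)
        self.w1 = np.abs(rng.normal(size=hidden_dim))  # non-negative weights
        self.a = rng.normal(size=action_dim)

    def potential(self, x):
        # Scalar convex potential evaluated at a single action vector x.
        return self.w1 @ softplus(self.W0 @ x + self.b0) + self.a @ x

    def transport(self, x):
        # Analytic gradient of the potential; this is the candidate map T(x).
        return self.W0.T @ (self.w1 * sigmoid(self.W0 @ x + self.b0)) + self.a


def mean_transport_cost(icnn, actions):
    # Average squared displacement ||x - T(x)||^2 over a batch of actions.
    # It equals the squared 2-Wasserstein distance only if T is the optimal
    # map between the two action distributions, which this untrained
    # network is not.
    moved = np.stack([icnn.transport(x) for x in actions])
    return float(np.mean(np.sum((actions - moved) ** 2, axis=1)))


if __name__ == "__main__":
    icnn = TinyICNN(action_dim=3, hidden_dim=8)
    batch = np.random.default_rng(1).normal(size=(16, 3))
    print("mean transport cost:", mean_transport_cost(icnn, batch))
```

Because the map is read off as a gradient of a single convex potential, no separate discriminator network is needed to evaluate the transport cost, which is the sense in which such a computation can be discriminator-free.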

Taxonomy

Topics: Machine Learning and ELM · Adversarial Robustness in Machine Learning