Optimal Transport-Guided Safety in Temporal Difference Reinforcement Learning

Zahra Shahrooei; Ali Baheri

arXiv:2502.16328 · cs.LG · June 17, 2025

TL;DR

This paper introduces a novel reinforcement learning algorithm that uses optimal transport theory to quantify action uncertainty, promoting safer decision-making while maintaining performance in uncertain environments.

Contribution

It proposes a new temporal difference algorithm that incorporates optimal transport-based uncertainty scores to enhance safety in reinforcement learning policies.

Findings

01 Reduces the probability of unsafe state visits

02 Maintains performance under environment uncertainty

03 Provides safer decision-making in stochastic settings

Abstract

The primary goal of reinforcement learning is to develop decision-making policies that prioritize optimal performance, frequently without considering safety. In contrast, safe reinforcement learning seeks to reduce or avoid unsafe behavior. This paper views safety as taking actions with more predictable consequences under environment stochasticity and introduces a temporal difference algorithm that uses optimal transport theory to quantify the uncertainty associated with actions. By integrating this uncertainty score into the decision-making objective, the agent is encouraged to favor actions with more predictable outcomes. We theoretically prove that our algorithm leads to a reduction in the probability of visiting unsafe states. We evaluate the proposed algorithm on several case studies in the presence of various forms of environment uncertainty. The results demonstrate that our…
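The idea of folding an optimal-transport uncertainty score into action selection can be illustrated with a small sketch. This is not the paper's algorithm: the score definition (the 1-D Wasserstein-1 distance between an action's next-state distribution and a point mass at its most likely outcome), the penalty weight `beta`, and the function names `wasserstein_1d`, `uncertainty_scores`, and `select_action` are all assumptions made for illustration only.

```python
import numpy as np


def wasserstein_1d(p, q, support):
    """W1 distance between two discrete distributions on a sorted 1-D support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    support = np.asarray(support, dtype=float)
    # In one dimension, W1 is the area between the two CDFs.
    cdf_gap = np.abs(np.cumsum(p) - np.cumsum(q))[:-1]
    return float(np.sum(cdf_gap * np.diff(support)))


def uncertainty_scores(next_state_probs, support):
    """Hypothetical score: distance from each action's next-state
    distribution to a point mass at its most likely outcome."""
    scores = []
    for probs in next_state_probs:
        dirac = np.zeros_like(probs, dtype=float)
        dirac[np.argmax(probs)] = 1.0
        scores.append(wasserstein_1d(probs, dirac, support))
    return np.array(scores)


def select_action(q_values, scores, beta=1.0):
    """Pick the action maximizing value minus an assumed uncertainty penalty."""
    return int(np.argmax(np.asarray(q_values, dtype=float) - beta * scores))


# Toy example: three next states on a line, two actions.
support = np.array([0.0, 1.0, 2.0])
next_state_probs = np.array([
    [0.0, 1.0, 0.0],  # action 0: deterministic outcome
    [0.4, 0.2, 0.4],  # action 1: spread-out outcomes
])
q_values = np.array([1.0, 1.2])

scores = uncertainty_scores(next_state_probs, support)
greedy_action = int(np.argmax(q_values))
safe_action = select_action(q_values, scores, beta=1.0)
```

In this toy setup the purely greedy choice is the higher-value but less predictable action, while the penalized choice is the deterministic one. How the paper actually defines and weights its score may differ from this sketch.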

Code & Models

Repositories

sailrit/risk-averse-td-learning
none · Official

Taxonomy

Topics: Fuel Cells and Related Materials · Traffic control and management · Machine Learning and ELM