Probabilistic Satisfaction of Temporal Logic Constraints in Reinforcement Learning via Adaptive Policy-Switching

Xiaoshan Lin; Sadık Bera Yüksel; Yasin Yazıcıoğlu; Derya Aksaray

arXiv:2410.08022·cs.AI·December 2, 2024

Open Access

TL;DR

This paper introduces an adaptive policy-switching framework for constrained reinforcement learning that maintains a desired probability of temporal logic constraint satisfaction throughout the learning process.

Contribution

The paper proposes a method that switches between pure learning (reward maximization) and constraint satisfaction. It estimates the probability of constraint satisfaction from earlier trials and adjusts the switching probability accordingly.

Findings

01. The framework effectively balances reward and constraint satisfaction.

02. Theoretical validation confirms the algorithm's correctness.

03. Simulations demonstrate improved constraint adherence during learning.

Abstract

Constrained Reinforcement Learning (CRL) is a subset of machine learning that introduces constraints into the traditional reinforcement learning (RL) framework. Unlike conventional RL, which aims solely to maximize cumulative rewards, CRL incorporates additional constraints that represent specific mission requirements or limitations that the agent must comply with during the learning process. In this paper, we address a type of CRL problem where an agent aims to learn the optimal policy to maximize reward while ensuring a desired level of temporal logic constraint satisfaction throughout the learning process. We propose a novel framework that relies on switching between pure learning (reward maximization) and constraint satisfaction. This framework estimates the probability of constraint satisfaction based on earlier trials and properly adjusts the probability of switching between…
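The switching idea in the abstract can be illustrated with a small sketch. The abstract is truncated and does not give the estimator or the adjustment rule, so everything below is an assumption made for illustration: the class name `SwitchingEstimator`, the Laplace-smoothed frequency estimate, and the worst-case premise that a pure-learning episode never satisfies the constraint. It is not the authors' algorithm.

```python
import random


class SwitchingEstimator:
    """Hypothetical helper: tracks outcomes of earlier constraint-satisfaction trials."""

    def __init__(self, desired_prob: float):
        # Desired probability of satisfying the temporal logic constraint.
        self.desired_prob = desired_prob
        self.successes = 0
        self.trials = 0

    def record(self, satisfied: bool) -> None:
        """Store whether a constraint-satisfaction episode met the constraint."""
        self.trials += 1
        self.successes += int(satisfied)

    def estimate(self) -> float:
        """Laplace-smoothed success frequency (assumed estimator, never zero)."""
        return (self.successes + 1) / (self.trials + 2)

    def switch_prob(self) -> float:
        """Probability of running the constraint-satisfaction policy next.

        Assumption: a pure-learning episode satisfies the constraint with
        probability 0, so we need switch_prob * estimate >= desired_prob.
        """
        return min(1.0, self.desired_prob / self.estimate())


def choose_mode(est: SwitchingEstimator, rng: random.Random) -> str:
    """Sample which policy to follow in the next episode."""
    return "constraint" if rng.random() < est.switch_prob() else "learning"
```

With `desired_prob=0.8` and eight successful trials out of eight, the smoothed estimate is 0.9, so the sketch would run the constraint-satisfaction policy with probability of about 0.889 and spend the remaining episodes on reward maximization. With no trials recorded, the estimate is 0.5 and the switch probability is clipped to 1.0, so the agent starts conservatively.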

Taxonomy

Topics: Formal Methods in Verification