Probabilistically Guaranteed Satisfaction of Temporal Logic Constraints During Reinforcement Learning
Derya Aksaray, Yasin Yazicioglu, Ahmet Semi Asarkaya

TL;DR
This paper introduces a constrained reinforcement learning method that satisfies temporal logic constraints with a desired probability throughout learning. The guarantee holds in every episode, not only after a sufficiently large number of episodes.
Contribution
It presents an automata-theoretic approach that ensures probabilistic constraint satisfaction in each episode, in contrast to penalty-based methods that reach constraint satisfaction only after many episodes. The approach comes with theoretical guarantees and a numerical demonstration.
Findings
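Guarantees that the constraint is satisfied with a desired probability in each episode.
Computes a lower bound on the probability of constraint satisfaction and adjusts exploration behavior as needed.
Demonstrates the approach numerically in a drone scenario with periodically arriving pick-up and delivery tasks.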
Abstract
We propose a novel constrained reinforcement learning method for finding optimal policies in Markov Decision Processes while satisfying temporal logic constraints with a desired probability throughout the learning process. An automata-theoretic approach is proposed to ensure the probabilistic satisfaction of the constraint in each episode, which is different from penalizing violations to achieve constraint satisfaction after a sufficiently large number of episodes. The proposed approach is based on computing a lower bound on the probability of constraint satisfaction and adjusting the exploration behavior as needed. We present theoretical results on the probabilistic constraint satisfaction achieved by the proposed approach. We also numerically demonstrate the proposed idea in a drone scenario, where the constraint is to perform periodically arriving pick-up and delivery tasks and the…
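The idea of computing a lower bound on the satisfaction probability and gating exploration on it can be illustrated with a toy sketch. This is not the paper's algorithm: the model below (a goal at a known `distance`, a "progress" action that moves one cell closer with probability at least `p_success` and otherwise stays put, and an exploratory action that costs one step and moves at most one cell away) is an assumption made purely for illustration, and every name in it is hypothetical.

```python
import math
import random


def satisfaction_lower_bound(distance, steps_left, p_success):
    """Lower bound on P(reach the goal within steps_left steps).

    Toy assumption: each "progress" step moves one cell closer with
    probability at least p_success, and otherwise the agent stays put.
    Reaching the goal then needs at least `distance` successes in
    `steps_left` trials, which is a binomial tail probability.
    """
    if distance <= 0:
        return 1.0
    return sum(
        math.comb(steps_left, k)
        * p_success ** k
        * (1.0 - p_success) ** (steps_left - k)
        for k in range(distance, steps_left + 1)
    )


def choose_action(distance, steps_left, p_success, threshold, epsilon, rng):
    """Explore only if the bound stays above the threshold afterwards.

    Toy assumption: one exploratory step uses up one time step and
    moves the agent at most one cell farther from the goal.
    """
    bound_after_explore = satisfaction_lower_bound(
        distance + 1, steps_left - 1, p_success
    )
    if bound_after_explore >= threshold and rng.random() < epsilon:
        return "explore"
    return "progress"


if __name__ == "__main__":
    rng = random.Random(0)
    # Plenty of time left: exploration is permitted.
    print(choose_action(1, 10, 0.9, 0.9, 1.0, rng))
    # No slack left: the agent must make progress toward the goal.
    print(choose_action(5, 5, 0.9, 0.9, 1.0, rng))
```

The design choice worth noting is that the guard is evaluated on the worst state exploration could lead to, so the desired probability is still attainable within the same episode rather than only in the long run.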
Taxonomy
Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Formal Methods in Verification
