Reinforcement Learning Under Probabilistic Spatio-Temporal Constraints with Time Windows
Xiaoshan Lin, Abbasali Koochakzadeh, Yasin Yazicioglu, Derya Aksaray

TL;DR
This paper introduces an automata-theoretic reinforcement learning method that enforces a desired probability of satisfying spatio-temporal constraints with time windows throughout learning, not only once an optimal policy has been learned.
Contribution
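The approach translates a bounded temporal logic constraint into a total automaton and avoids "unsafe" actions using prior upper and lower bounds on each transition probability, so that the constraint is satisfied with a desired probability during RL. It comes with theoretical guarantees on that probability and with numerical results.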
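To make the idea of a total automaton for a time-window constraint concrete, here is a minimal hand-built sketch for one illustrative requirement: "observe a label at some step t with start <= t <= end". The function names, the label "pickup", and the window [2, 3] are hypothetical and not taken from the paper; the paper's own translation from bounded temporal logic is more general.

```python
def make_window_automaton(label, start, end):
    """Total automaton for: observe `label` at some step t with start <= t <= end.

    States are the integers 0..end (current step index, constraint not yet
    satisfied) plus two absorbing states, "acc" and "rej". The transition
    function is total: every state has a successor for every observation.
    """
    def step(q, observed):
        if q in ("acc", "rej"):          # absorbing states
            return q
        if start <= q <= end and label in observed:
            return "acc"                 # label seen inside the time window
        return q + 1 if q < end else "rej"  # window expired without the label
    return 0, step


def run(trace, label="pickup", start=2, end=3):
    """Feed a sequence of observation sets through the automaton."""
    q, step = make_window_automaton(label, start, end)
    for observed in trace:
        q = step(q, observed)
    return q


on_time = run([set(), {"pickup"}, set(), {"pickup"}])   # label seen at t = 3
too_early = run([{"pickup"}, set(), set(), set()])      # label seen only at t = 0
```

Because the automaton has explicit accepting and rejecting sinks, a learner can track its state alongside the environment state and tell at every step whether the constraint is already satisfied, already violated, or still open.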
Findings
Guarantees on probability of constraint satisfaction
Effective exploration in high-reward environments
Successful application to robot pick-up and delivery tasks
Abstract
We propose an automata-theoretic approach for reinforcement learning (RL) under complex spatio-temporal constraints with time windows. The problem is formulated using a Markov decision process under a bounded temporal logic constraint. Different from existing RL methods that can eventually learn optimal policies satisfying such constraints, our proposed approach enforces a desired probability of constraint satisfaction throughout learning. This is achieved by translating the bounded temporal logic constraint into a total automaton and avoiding "unsafe" actions based on the available prior information regarding the transition probabilities, i.e., a pair of upper and lower bounds for each transition probability. We provide theoretical guarantees on the resulting probability of constraint satisfaction. We also provide numerical results in a scenario where a robot explores the environment…
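The action-avoidance idea in the abstract can be illustrated with a small sketch. It assumes a finite product of environment and automaton states, interval bounds (lo, hi) on every transition probability, and a reachability-style acceptance condition; it then keeps only actions whose worst-case probability of reaching an accepting state within the remaining steps meets a threshold. This is a generic robust backward induction over an interval model, not the authors' algorithm, and all names and numbers in it are hypothetical.

```python
def worst_case_expectation(bounds, values):
    """Minimise sum(p[s] * values[s]) over p with lo <= p[s] <= hi, sum(p) == 1."""
    p = {s: lo for s, (lo, hi) in bounds.items()}
    slack = 1.0 - sum(p.values())
    # Push the remaining probability mass toward the lowest-valued successors.
    for s in sorted(bounds, key=lambda s: values[s]):
        lo, hi = bounds[s]
        extra = max(0.0, min(hi - lo, slack))
        p[s] += extra
        slack -= extra
    return sum(p[s] * values[s] for s in bounds)


def safe_actions(intervals, accepting, horizon, threshold):
    """intervals[s][a] = {next_state: (lo, hi)}; every successor is a key of intervals.

    Returns (value, allowed), both indexed by remaining steps k:
    value[k][s] is a lower bound on the probability of reaching `accepting`
    within k steps under the best action; allowed[k][s] lists the actions
    whose own lower bound is at least `threshold`.
    """
    states = list(intervals)
    value = [{s: 1.0 if s in accepting else 0.0 for s in states}]
    allowed = [{s: [] for s in states}]
    for _ in range(horizon):
        prev, v_k, a_k = value[-1], {}, {}
        for s in states:
            if s in accepting:           # already satisfied: nothing to prune
                v_k[s], a_k[s] = 1.0, list(intervals[s])
                continue
            q = {a: worst_case_expectation(b, prev) for a, b in intervals[s].items()}
            v_k[s] = max(q.values(), default=0.0)
            a_k[s] = [a for a, qa in q.items() if qa >= threshold]
        value.append(v_k)
        allowed.append(a_k)
    return value, allowed


# Toy model (hypothetical numbers): one decision state and two absorbing states.
intervals = {
    "s0": {
        "safe": {"goal": (0.8, 0.9), "trap": (0.1, 0.2)},
        "risky": {"goal": (0.3, 0.6), "trap": (0.4, 0.7)},
    },
    "goal": {"stay": {"goal": (1.0, 1.0)}},
    "trap": {"stay": {"trap": (1.0, 1.0)}},
}
value, allowed = safe_actions(intervals, accepting={"goal"}, horizon=1, threshold=0.75)
```

In this toy model the worst case for "safe" is 0.8 and for "risky" is 0.3, so with a threshold of 0.75 only "safe" remains available at "s0". A learner restricted to the allowed actions can still explore among them while the lower bound on satisfaction holds for any transition probabilities inside the given intervals.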
Taxonomy
Topics: Formal Methods in Verification · Reinforcement Learning in Robotics
