Reinforcement learning with timed constraints for robotics motion planning
Zhaoan Wang, Junchao Li, Mahdi Mohammad, Shaoping Xiao

TL;DR
This paper introduces a unified automata-based reinforcement learning framework that enables robots to learn policies satisfying complex temporal constraints expressed in MITL, even under stochastic and partially observable conditions.
Contribution
It develops a method that translates MITL formulas into timed automata and integrates them with RL, handling both MDPs and POMDPs for time-critical robotic planning.
Findings
Successfully learned policies in grid-world and service-robot scenarios
Ensured temporal correctness under stochastic dynamics
Scalable to larger state spaces and partial observability
Abstract
Robotic systems operating in dynamic and uncertain environments increasingly require planners that satisfy complex task sequences while adhering to strict temporal constraints. Metric Interval Temporal Logic (MITL) offers a formal and expressive framework for specifying such time-bounded requirements; however, integrating MITL with reinforcement learning (RL) remains challenging due to stochastic dynamics and partial observability. This paper presents a unified automata-based RL framework for synthesizing policies in both Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) under MITL specifications. MITL formulas are translated into Timed Limit-Deterministic Generalized Büchi Automata (Timed-LDGBA) and synchronized with the underlying decision process to construct product timed models suitable for Q-learning. A simple yet expressive reward…
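To make the idea of a product timed model concrete, the following is a minimal sketch and not the paper's Timed-LDGBA construction. It assumes a toy corridor MDP, discrete time, and a hand-written three-state timed automaton for the single time-bounded specification "reach the goal within T steps". All names (`automaton_step`, `product_step`, `train`), the constants, and the reward of 1 on entering the accepting state are hypothetical choices for illustration; the abstract does not give the paper's reward design.

```python
import random
from collections import defaultdict

# Hypothetical toy setup; none of these values come from the paper.
N_CELLS = 6          # corridor cells 0..5
GOAL = 5             # cell that must be reached
DEADLINE = 8         # time bound T in the illustrative spec "reach GOAL within T steps"
SLIP = 0.1           # probability that a move has no effect (stochastic dynamics)
ACTIONS = (-1, +1)   # step left / step right


def automaton_step(q, clock, at_goal):
    """Advance the toy timed automaton by one time unit.

    q is 'wait', 'acc' (goal reached in time) or 'rej' (deadline missed);
    clock counts the time steps elapsed so far.
    """
    if q != "wait":
        return q, clock                  # 'acc' and 'rej' are absorbing
    clock += 1
    if at_goal and clock <= DEADLINE:
        return "acc", clock
    if clock >= DEADLINE:
        return "rej", clock
    return "wait", clock


def product_step(state, action, rng):
    """One transition of the product model: (position, automaton state, clock)."""
    pos, q, clock = state
    if rng.random() >= SLIP:             # the move succeeds unless the robot slips
        pos = min(max(pos + action, 0), N_CELLS - 1)
    q2, clock2 = automaton_step(q, clock, pos == GOAL)
    # Assumed reward: 1 on entering the accepting state, 0 otherwise.
    reward = 1.0 if (q == "wait" and q2 == "acc") else 0.0
    done = q2 != "wait"
    return (pos, q2, clock2), reward, done


def train(episodes=3000, alpha=0.2, gamma=0.99, eps=0.2, seed=0):
    """Tabular Q-learning over product states with epsilon-greedy exploration."""
    rng = random.Random(seed)
    Q = defaultdict(float)
    for _ in range(episodes):
        state, done = (0, "wait", 0), False
        while not done:
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                # Greedy action, breaking ties at random.
                best = max(Q[(state, b)] for b in ACTIONS)
                a = rng.choice([b for b in ACTIONS if Q[(state, b)] == best])
            nxt, r, done = product_step(state, a, rng)
            target = r if done else r + gamma * max(Q[(nxt, b)] for b in ACTIONS)
            Q[(state, a)] += alpha * (target - Q[(state, a)])
            state = nxt
    return Q


if __name__ == "__main__":
    Q = train()
    start = (0, "wait", 0)
    print({a: round(Q[(start, a)], 3) for a in ACTIONS})
```

Because the automaton state and the clock are part of the learner's state, the same grid cell can carry different values depending on how much time remains, which is what lets a memoryless Q-table respect a deadline. The sketch uses a finite reachability objective; a Büchi acceptance condition over infinite runs, as named in the abstract, would need a different reward and termination scheme.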
Taxonomy
Topics: Formal Methods in Verification · Reinforcement Learning in Robotics · Robotic Path Planning Algorithms
