A Framework for Transforming Specifications in Reinforcement Learning
Rajeev Alur, Suguman Bansal, Osbert Bastani, Kishor Jothimurugan

TL;DR
This paper introduces a formal framework for transforming reinforcement learning tasks with different objectives, analyzing the theoretical guarantees of such transformations, and identifying limitations in existing approaches for high-level specifications.
Contribution
It develops a formal framework for task transformations in RL, formalizes properties like policy preservation and robustness, and explores the limitations of reductions from temporal logic to reward-based objectives.
Findings
Certain reductions from LTL to reward-based objectives do not exist.
No RL algorithms with PAC-MDP guarantees for safety specifications.
Restates and extends known results on task transformations in RL.
Abstract
Reactive synthesis algorithms allow automatic construction of policies to control an environment modeled as a Markov Decision Process (MDP) that are optimal with respect to high-level temporal logic specifications. However, they assume that the MDP model is known a priori. Reinforcement Learning (RL) algorithms, in contrast, are designed to learn an optimal policy when the transition probabilities of the MDP are unknown, but require the user to associate local rewards with transitions. The appeal of high-level temporal logic specifications has motivated research to develop RL algorithms for synthesis of policies from specifications. To understand the techniques, and nuanced variations in their theoretical guarantees, in the growing body of resulting literature, we develop a formal framework for defining transformations among RL tasks with different forms of objectives. We define the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsFormal Methods in Verification · Safety Systems Engineering in Autonomy · Software Reliability and Analysis Research
