LCRL: Certified Policy Synthesis via Logically-Constrained Reinforcement Learning
Hosein Hasanbeig, Daniel Kroening, Alessandro Abate

TL;DR
LCRL is a software tool that uses model-free reinforcement learning with automata-based reward shaping to synthesize policies satisfying linear temporal logic specifications with proven optimality guarantees.
Contribution
The paper introduces LCRL, a novel framework combining automata-guided reward shaping with model-free RL for policy synthesis under temporal logic constraints, with theoretical guarantees.
Findings
LCRL achieves robust performance in case studies.
LCRL scales well compared to standard RL methods.
Theoretical guarantees ensure convergence to optimal policies.
Abstract
LCRL is a software tool that implements model-free Reinforcement Learning (RL) algorithms over unknown Markov Decision Processes (MDPs), synthesising policies that satisfy a given linear temporal specification with maximal probability. LCRL leverages partially deterministic finite-state machines known as Limit Deterministic Buchi Automata (LDBA) to express a given linear temporal specification. A reward function for the RL algorithm is shaped on-the-fly, based on the structure of the LDBA. Theoretical guarantees under proper assumptions ensure the convergence of the RL algorithm to an optimal policy that maximises the satisfaction probability. We present case studies to demonstrate the applicability, ease of use, scalability, and performance of LCRL. Owing to the LDBA-guided exploration and LCRL model-free architecture, we observe robust performance, which also scales well when compared…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsFormal Methods in Verification · Advanced Software Engineering Methodologies · Software Reliability and Analysis Research
