Regret-Free Reinforcement Learning for LTL Specifications
Rupak Majumdar, Mahmoud Salamati, Sadegh Soudjani

TL;DR
This paper introduces the first regret-free online learning algorithm for controlling unknown systems against linear temporal logic (LTL) specifications. It provides finite-time guarantees, improving on methods that offer only asymptotic ones.
Contribution
The paper presents a regret-free learning algorithm for LTL control in unknown finite MDPs, with finite-time performance bounds, together with a separate algorithm for learning the MDP's graph structure.
Findings
First regret-free algorithm for LTL control in unknown systems
Finite-time bounds on how far the learned controller is from optimal
A separate algorithm for learning the system's graph structure, which runs independently of the main regret-free algorithm
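The graph-learning finding can be illustrated with a simple counting argument. The sketch below is not the paper's algorithm. It assumes, as the abstract does, a known lower bound on every nonzero transition probability, and additionally assumes that each state-action pair can be sampled independently as many times as needed (a simplification; an online learner cannot reset at will). The function name and parameters are hypothetical.

```python
import math


def samples_to_see_all_edges(p_min: float, num_edges: int, delta: float) -> int:
    """Samples per state-action pair so that, with probability >= 1 - delta,
    every edge with probability >= p_min is observed at least once.

    An edge with probability >= p_min is missed in all n independent samples
    with probability at most (1 - p_min)**n. A union bound over num_edges
    edges gives a failure probability of at most num_edges * (1 - p_min)**n,
    so we pick the smallest n that makes this at most delta.
    """
    if not (0.0 < p_min < 1.0):
        raise ValueError("p_min must lie strictly between 0 and 1")
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must lie strictly between 0 and 1")
    if num_edges < 1:
        raise ValueError("num_edges must be at least 1")
    # Solve num_edges * (1 - p_min)**n <= delta for n.
    n = math.log(num_edges / delta) / -math.log1p(-p_min)
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical numbers: 10 possible edges, p_min = 0.1, 95% confidence.
    print(samples_to_see_all_edges(p_min=0.1, num_edges=10, delta=0.05))
```

The point of the sketch is only that a known minimum transition probability turns "have I seen every edge?" into a question with a finite sample answer; without such a bound, an arbitrarily rare edge could stay hidden for arbitrarily long.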
Abstract
Learning to control an unknown dynamical system with respect to high-level temporal specifications is an important problem in control theory. We present the first regret-free online algorithm for learning a controller for linear temporal logic (LTL) specifications for systems with unknown dynamics. We assume that the underlying (unknown) dynamics is modeled by a finite-state and action Markov decision process (MDP). Our core technical result is a regret-free learning algorithm for infinite-horizon reach-avoid problems on MDPs. For general LTL specifications, we show that the synthesis problem can be reduced to a reach-avoid problem once the graph structure is known. Additionally, we provide an algorithm for learning the graph structure, assuming knowledge of a minimum transition probability, which operates independently of the main regret-free algorithm. Our LTL controller synthesis…
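To make the reach-avoid objective in the abstract concrete, here is a minimal sketch of the quantity a learner would be compared against: the maximum probability of reaching a goal set while avoiding an unsafe set. It uses standard value iteration on a fully known toy MDP, which is an assumption made for illustration only; the paper's setting has unknown dynamics, and this is not its learning algorithm. The MDP encoding, state numbering, and function name are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# P[state][action] is a list of (next_state, probability) pairs.
MDP = Dict[int, Dict[str, List[Tuple[int, float]]]]


def reach_avoid_values(
    P: MDP,
    goal: Set[int],
    avoid: Set[int],
    iters: int = 1000,
    tol: float = 1e-12,
) -> Dict[int, float]:
    """Maximum probability of reaching `goal` without entering `avoid`.

    Goal states are fixed at value 1 and avoid states at value 0. Starting
    the other states at 0 makes value iteration converge from below to the
    optimal reach-avoid probability.
    """
    v = {s: (1.0 if s in goal else 0.0) for s in P}
    for _ in range(iters):
        delta = 0.0
        for s in P:
            if s in goal or s in avoid:
                continue
            # Bellman update: best expected value over available actions.
            best = max(sum(p * v[t] for t, p in succ) for succ in P[s].values())
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < tol:
            break
    return v


if __name__ == "__main__":
    # Toy MDP: state 0 is the start, 1 is the goal, 2 is unsafe.
    toy: MDP = {
        0: {
            "safe": [(1, 0.5), (0, 0.5)],   # may loop, never unsafe
            "risky": [(1, 0.9), (2, 0.1)],  # faster, but can fail
        },
        1: {"stay": [(1, 1.0)]},
        2: {"stay": [(2, 1.0)]},
    }
    print(reach_avoid_values(toy, goal={1}, avoid={2}))
```

In this toy example, repeating the "safe" action reaches the goal with probability 1 over an infinite horizon, whereas "risky" fails with probability 0.1. A regret measure would then compare what a learner with unknown dynamics achieves against this optimal value.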
Taxonomy
Topics: Elevator Systems and Control · Fuzzy Logic and Control Systems · Multi-Agent Systems and Negotiation
Methods: Sparse Evolutionary Training
