Policy Synthesis and Reinforcement Learning for Discounted LTL

Rajeev Alur; Osbert Bastani; Kishor Jothimurugan; Mateo Perez; Fabio Somenzi; Ashutosh Trivedi

arXiv:2305.17115·cs.LO·May 31, 2023·1 cites

TL;DR

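This paper studies discounted linear temporal logic (LTL) as a way to specify objectives for policy synthesis and reinforcement learning in Markov decision processes with unknown transition probabilities. Discounting removes LTL's sensitivity to small perturbations in the transition probabilities, and when all discount factors are identical, discounted LTL reduces to discounted-sum reward via a reward machine.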

Contribution

The paper studies discounted LTL for policy synthesis in Markov decision processes with unknown transition probabilities, and shows how to reduce discounted LTL to discounted-sum reward via a reward machine when all discount factors are identical.

Findings

01

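Time discounting removes LTL's sensitivity to small perturbations in the transition probabilities, which otherwise prevents PAC learning without additional assumptions.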

02

Discounted LTL reduces to discounted-sum reward via a reward machine when all discount factors are identical.

03

The setting is policy synthesis in Markov decision processes with unknown transition probabilities.
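
The sensitivity named in the first finding can be illustrated on a toy Markov chain. The sketch below is not taken from the paper: the chain, the function names, and the exact scoring of a discounted "always safe" objective are assumptions chosen for illustration. The chain starts in a safe state and, at each step, moves to an absorbing unsafe state with probability `eps`. Under plain LTL, the probability of staying safe forever jumps from 1 to 0 as soon as `eps` is positive. Under the assumed discounted scoring, the expected value changes only slightly.

```python
def undiscounted_always_safe(eps):
    """Probability that the chain stays in the safe state forever.

    Each step the chain leaves the safe state with probability eps and
    never returns, so the probability is 1 when eps == 0 and 0 for every
    eps > 0.
    """
    return 1.0 if eps == 0 else 0.0


def discounted_always_safe(eps, discount):
    """Expected value of a discounted 'always safe' objective.

    Assumed semantics (hypothetical, for illustration only): a run whose
    first unsafe position is i scores 1 - discount**i, and a run that is
    always safe scores 1. The first unsafe position i >= 1 is geometric,
    P(i) = (1 - eps)**(i - 1) * eps, and summing the series gives
    1 - eps * discount / (1 - discount * (1 - eps)).
    """
    return 1.0 - eps * discount / (1.0 - discount * (1.0 - eps))


if __name__ == "__main__":
    for eps in (0.0, 1e-6, 1e-3):
        print(eps, undiscounted_always_safe(eps), discounted_always_safe(eps, 0.9))
```

In this toy setting, an arbitrarily small estimation error in `eps` flips the undiscounted value between 0 and 1, while the discounted value moves by an amount proportional to the error.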

Abstract

The difficulty of manually specifying reward functions has led to an interest in using linear temporal logic (LTL) to express objectives for reinforcement learning (RL). However, LTL has the downside that it is sensitive to small perturbations in the transition probabilities, which prevents probably approximately correct (PAC) learning without additional assumptions. Time discounting provides a way of removing this sensitivity, while retaining the high expressivity of the logic. We study the use of discounted LTL for policy synthesis in Markov decision processes with unknown transition probabilities, and show how to reduce discounted LTL to discounted-sum reward via a reward machine when all discount factors are identical.
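
To make the reduction to discounted-sum reward concrete, here is a minimal sketch for one simple formula, a discounted "eventually p". It is not the paper's construction. It assumes the common semantics in which the formula is worth `discount**i`, where `i` is the first position at which `p` holds, and 0 if `p` never holds. The class and function names are hypothetical. The reward machine has two states and pays reward 1 the first time `p` is observed, so the ordinary discounted sum of its rewards equals the value of the formula.

```python
from dataclasses import dataclass


@dataclass
class EventuallyRewardMachine:
    """Two-state reward machine for a discounted 'eventually p' objective."""

    done: bool = False  # True once p has been observed

    def step(self, p_holds: bool) -> float:
        """Consume one label and return the reward for this step."""
        if not self.done and p_holds:
            self.done = True
            return 1.0
        return 0.0


def discounted_return(labels, discount):
    """Discounted sum of the rewards the machine emits along a finite trace."""
    machine = EventuallyRewardMachine()
    total = 0.0
    for i, p_holds in enumerate(labels):
        total += (discount ** i) * machine.step(p_holds)
    return total


def discounted_eventually(labels, discount):
    """Direct evaluation under the assumed semantics, on a finite prefix."""
    for i, p_holds in enumerate(labels):
        if p_holds:
            return discount ** i
    return 0.0


if __name__ == "__main__":
    trace = [False, False, True, False, True]
    print(discounted_return(trace, 0.9), discounted_eventually(trace, 0.9))
```

Both functions work on finite prefixes of a run. The sketch uses a single discount factor for the formula and for the reward sum, which mirrors the abstract's condition that all discount factors be identical.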

Taxonomy

Topics: Receptor Mechanisms and Signaling · Reinforcement Learning in Robotics · Formal Methods in Verification