Reinforcement Learning with Temporal-Logic-Based Causal Diagrams

Yash Paliwal; Rajarshi Roy; Jean-Raphaël Gaglione; Nasim Baharisangari; Daniel Neider; Xiaoming Duan; Ufuk Topcu; Zhe Xu

arXiv:2306.13732·cs.AI·June 27, 2023

TL;DR

This paper introduces Temporal-Logic-based Causal Diagrams (TL-CDs) to improve reinforcement learning by capturing causal relationships, enabling faster convergence and reduced exploration in temporally extended tasks.

Contribution

The paper proposes TL-CDs as a novel method to incorporate causal knowledge into RL, enhancing efficiency and convergence speed over traditional automata-based approaches.

Findings

01. TL-CDs enable early reward determination during exploration.

02. RL algorithms with TL-CDs converge faster to optimal policies.

03. Case studies show reduced exploration and improved efficiency.

Abstract

We study a class of reinforcement learning (RL) tasks where the objective of the agent is to accomplish temporally extended goals. In this setting, a common approach is to represent the tasks as deterministic finite automata (DFA) and integrate them into the state-space for RL algorithms. However, while these machines model the reward function, they often overlook the causal knowledge about the environment. To address this limitation, we propose the Temporal-Logic-based Causal Diagram (TL-CD) in RL, which captures the temporal causal relationships between different properties of the environment. We exploit the TL-CD to devise an RL algorithm in which an agent requires significantly less exploration of the environment. To this end, based on a TL-CD and a task DFA, we identify configurations where the agent can determine the expected rewards early during an exploration. Through a series…
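
To make the idea of "determining the expected reward early" concrete, here is a minimal Python sketch. It is not the paper's algorithm: the `TaskDFA` class, the `can_still_accept` helper, the toy propositions `g` (goal) and `t` (trap), and the way causal knowledge is encoded (as a set of propositions known to be impossible from now on) are all assumptions made for illustration. The sketch only shows that once causal knowledge rules out certain observations, a reachability check on the task DFA can tell the agent that no reward is attainable, so the episode need not be explored further.

```python
from dataclasses import dataclass


@dataclass
class TaskDFA:
    """Hypothetical task DFA over atomic propositions (illustrative only)."""
    initial: str
    accepting: set
    # transitions[(state, proposition)] -> next state; anything else self-loops
    transitions: dict

    def step(self, state, label):
        """Advance on a set of observed propositions.

        Simplification: the first matching proposition (in sorted order) wins.
        """
        for prop in sorted(label):
            nxt = self.transitions.get((state, prop))
            if nxt is not None:
                return nxt
        return state


def can_still_accept(dfa, state, impossible):
    """Return True if an accepting state is reachable from `state`
    using only propositions that are not in `impossible`.

    `impossible` stands in for causal knowledge (an assumption of this sketch):
    propositions that the environment can no longer produce.
    """
    frontier, seen = [state], {state}
    while frontier:
        q = frontier.pop()
        if q in dfa.accepting:
            return True
        for (src, prop), dst in dfa.transitions.items():
            if src == q and prop not in impossible and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return False


# Toy task: reach the goal `g` before stepping on the trap `t`.
dfa = TaskDFA(
    initial="q0",
    accepting={"q_acc"},
    transitions={("q0", "g"): "q_acc", ("q0", "t"): "q_rej"},
)

# Without causal knowledge the outcome is still open in q0.
open_outcome = can_still_accept(dfa, "q0", impossible=set())

# If causal knowledge says `g` can never occur again, the reward is already
# known to be unattainable, so exploration of this episode can stop early.
doomed = not can_still_accept(dfa, "q0", impossible={"g"})
```

In an RL loop, a check of this kind would run after each environment step on the current DFA state, and the episode would be terminated as soon as the outcome is determined.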

Taxonomy

Topics: Formal Methods in Verification · Auction Theory and Applications · Optimization and Search Problems

Methods: Direct Feedback Alignment