Reinforcement Learning with Temporal-Logic-Based Causal Diagrams
Yash Paliwal, Rajarshi Roy, Jean-Raphaël Gaglione, Nasim Baharisangari, Daniel Neider, Xiaoming Duan, Ufuk Topcu, Zhe Xu

TL;DR
This paper introduces Temporal-Logic-based Causal Diagrams (TL-CDs) to improve reinforcement learning by capturing causal relationships, enabling faster convergence and reduced exploration in temporally extended tasks.
Contribution
The paper proposes TL-CDs as a novel method to incorporate causal knowledge into RL, enhancing efficiency and convergence speed over traditional automata-based approaches.
Findings
TL-CDs enable early reward determination during exploration.
RL algorithms with TL-CDs converge faster to optimal policies.
Case studies show reduced exploration and improved efficiency.
Abstract
We study a class of reinforcement learning (RL) tasks where the objective of the agent is to accomplish temporally extended goals. In this setting, a common approach is to represent the tasks as deterministic finite automata (DFA) and integrate them into the state-space for RL algorithms. However, while these machines model the reward function, they often overlook the causal knowledge about the environment. To address this limitation, we propose the Temporal-Logic-based Causal Diagram (TL-CD) in RL, which captures the temporal causal relationships between different properties of the environment. We exploit the TL-CD to devise an RL algorithm in which an agent requires significantly less exploration of the environment. To this end, based on a TL-CD and a task DFA, we identify configurations where the agent can determine the expected rewards early during an exploration. Through a series…
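To make the abstract's idea concrete, here is a minimal Python sketch of tracking a task DFA alongside the environment state and checking whether an accepting state is still reachable once some labels are ruled out. It is an illustration under stated assumptions, not the paper's algorithm: the toy DFA, the label names ("key", "goal"), and the helpers `dfa_step`, `product_step`, and `can_still_accept` are all hypothetical, and the causal knowledge of a TL-CD is reduced here to a plain set of labels assumed never to occur again.

```python
from collections import deque

# Hypothetical task DFA: observe "key", then "goal".
# States: 0 (start), 1 (key seen), 2 (accepting).
TRANSITIONS = {
    (0, "key"): 1,
    (1, "goal"): 2,
}
ACCEPTING = {2}


def dfa_step(q, label):
    """Advance the DFA; unlisted (state, label) pairs self-loop."""
    return TRANSITIONS.get((q, label), q)


def product_step(env_state, q, env_step, labeler, action):
    """One step in the product of the environment and the task DFA.

    env_step and labeler are caller-supplied functions (assumptions):
    env_step(state, action) -> next state, labeler(state) -> label.
    """
    next_env = env_step(env_state, action)
    return next_env, dfa_step(q, labeler(next_env))


def can_still_accept(q, forbidden_labels):
    """Breadth-first search over DFA states, skipping forbidden labels.

    forbidden_labels stands in for causal knowledge: labels assumed
    impossible from now on. If no accepting state is reachable, the
    episode's reward is already determined and exploration can stop.
    """
    seen, queue = {q}, deque([q])
    while queue:
        cur = queue.popleft()
        if cur in ACCEPTING:
            return True
        for (src, label), dst in TRANSITIONS.items():
            if src == cur and label not in forbidden_labels and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return False
```

In an exploration loop, an agent could call `can_still_accept` after each `product_step` and end the episode as soon as it returns `False`. A real TL-CD expresses temporal-logic relationships between environment properties, so the forbidden set would have to be derived from those formulas rather than supplied by hand.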
Taxonomy
Topics: Formal Methods in Verification · Auction Theory and Applications · Optimization and Search Problems
Methods: Deterministic Finite Automata
