Variable-Agnostic Causal Exploration for Reinforcement Learning
Minh Hoang Nguyen, Hung Le, Svetha Venkatesh

TL;DR
VACERL introduces a causal exploration framework for reinforcement learning that automatically identifies influential variables and guides exploration without prior causal knowledge, improving performance in complex environments.
Contribution
The paper presents a novel, variable-agnostic causal exploration method that constructs causal graphs to guide RL exploration without assuming the causal variables are known in advance.
Findings
Significant performance improvements in grid-world and 2D games.
Enhanced exploration efficiency in sparse reward scenarios.
Robustness in noisy environments like Noisy-TV.
Abstract
Modern reinforcement learning (RL) struggles to capture real-world cause-and-effect dynamics, leading to inefficient exploration due to extensive trial-and-error actions. While recent efforts to improve agent exploration have leveraged causal discovery, they often make unrealistic assumptions of causal variables in the environments. In this paper, we introduce a novel framework, Variable-Agnostic Causal Exploration for Reinforcement Learning (VACERL), incorporating causal relationships to drive exploration in RL without specifying environmental causal variables. Our approach automatically identifies crucial observation-action steps associated with key variables using attention mechanisms. Subsequently, it constructs the causal graph connecting these steps, which guides the agent towards observation-action pairs with greater causal influence on task completion. This can be leveraged to…
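The first step described in the abstract, picking out crucial observation-action steps with attention, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`softmax`, `top_k_steps`, `exploration_bonus`), the toy trajectory, the raw scores, and the flat bonus value are all assumptions made for illustration, and the causal-graph construction between the selected steps is omitted entirely. The sketch only shows how softmax-normalised attention scores over a trajectory could be used to select a few steps and reward the agent for revisiting them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_steps(trajectory, scores, k):
    """Return the k (observation, action) pairs with the highest attention
    weight, kept in their original trajectory order.

    `scores` stands in for attention logits that a trained model would
    produce; here they are supplied by hand (hypothetical values).
    """
    weights = softmax(scores)
    ranked = sorted(range(len(trajectory)), key=lambda i: weights[i], reverse=True)
    return [trajectory[i] for i in sorted(ranked[:k])]


def exploration_bonus(pair, crucial_pairs, bonus=0.1):
    """Hypothetical intrinsic reward: a flat bonus for visiting a selected pair."""
    return bonus if pair in crucial_pairs else 0.0


# Toy trajectory of (observation, action) pairs from an imagined key-door task.
trajectory = [
    ("start", "right"),
    ("hall", "right"),
    ("key", "pickup"),
    ("hall", "left"),
    ("door", "open"),
]
# Hand-written attention logits, one per step (assumed, not learned).
scores = [0.1, 0.2, 2.5, 0.0, 3.0]

crucial = top_k_steps(trajectory, scores, k=2)
print(crucial)
print(exploration_bonus(("key", "pickup"), crucial))
```

In a full system the scores would come from a learned attention model rather than being written by hand, and the bonus would presumably depend on each pair's position in the causal graph rather than being constant.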
Taxonomy
Topics: Software Engineering Research
Methods: Softmax · Attention Is All You Need
