Task-Aware Exploration via a Predictive Bisimulation Metric
Dayang Liang, Ruihan Liu, Lipeng Wan, Yunlong Liu, Bo An

TL;DR
This paper introduces TEB, a task-aware exploration method for visual reinforcement learning that uses a predictive bisimulation metric both to learn task-relevant representations and to drive exploration under sparse rewards.
Contribution
TEB is the first method to integrate task-relevant representations with exploration via a predictive bisimulation metric, addressing representation collapse and enhancing exploration.
Findings
TEB outperforms recent baselines on MetaWorld and Maze2D.
TEB effectively mitigates representation collapse under sparse rewards.
TEB achieves superior exploration ability in visual RL tasks.
Abstract
Accelerating exploration in visual reinforcement learning under sparse rewards remains challenging due to the substantial task-irrelevant variations. Despite advances in intrinsic exploration, many methods either assume access to low-dimensional states or lack task-aware exploration strategies, thereby rendering them fragile in visual domains. To bridge this gap, we present TEB, a Task-aware Exploration approach that tightly couples task-relevant representations with exploration through a predictive Bisimulation metric. Specifically, TEB leverages the metric not only to learn behaviorally grounded task representations but also to measure behaviorally intrinsic novelty over the learned latent space. To realize this, we first theoretically mitigate the representation collapse of degenerate bisimulation metrics under sparse rewards by internally introducing a simple but effective predicted…
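To make the collapse issue mentioned in the abstract concrete, here is a minimal sketch. It is not TEB's formulation, which the truncated abstract does not spell out. It uses the standard bisimulation-style recursion from prior work, d(i, j) = |r_i - r_j| + gamma * d(next(i), next(j)), on a hypothetical deterministic 4-state chain. The function name `bisim_distance`, the chain, and the reward vectors are all assumptions made for illustration.

```python
import numpy as np

def bisim_distance(rewards, next_state, gamma=0.9, iters=200):
    """Fixed-point iteration for d(i, j) = |r_i - r_j| + gamma * d(next(i), next(j)).

    Assumes a deterministic chain under a fixed policy, so the Wasserstein term
    between next-state distributions reduces to d at the successor states.
    """
    rewards = np.asarray(rewards, dtype=float)
    next_state = np.asarray(next_state)
    # Pairwise absolute reward differences, shape (n, n).
    reward_gap = np.abs(rewards[:, None] - rewards[None, :])
    d = np.zeros_like(reward_gap)
    for _ in range(iters):
        # d[np.ix_(a, a)][i, j] == d[a[i], a[j]]: distance between successors.
        d = reward_gap + gamma * d[np.ix_(next_state, next_state)]
    return d

# Hypothetical chain 0 -> 1 -> 2 -> 3 -> 3 (state 3 is absorbing).
next_state = [1, 2, 3, 3]

# No reward observed anywhere: every pairwise distance stays at zero.
d_sparse = bisim_distance([0.0, 0.0, 0.0, 0.0], next_state)

# Reward on state 2 only: the states are now separated by the metric.
d_dense = bisim_distance([0.0, 0.0, 1.0, 0.0], next_state)
```

When no reward signal is present, the reward-difference term vanishes for every pair and the recursion has the all-zero matrix as its fixed point, so a representation trained to match these distances has no incentive to distinguish states. This is the degenerate behavior the abstract says TEB is designed to mitigate.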
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Reinforcement Learning in Robotics · EEG and Brain-Computer Interfaces
