ExTra: Transfer-guided Exploration

Anirban Santara; Rishabh Madan; Balaraman Ravindran; Pabitra Mitra

arXiv:1906.11785 · cs.LG · May 28, 2020 · 1 cites

TL;DR

ExTra introduces a transfer-guided exploration method in reinforcement learning that leverages optimal policies from related tasks to improve convergence rates and robustness across different environments.

Contribution

The paper proposes a novel transfer-guided exploration method using bisimulation distances to guide action sampling, enhancing convergence and robustness in RL.

Findings

01. ExTra outperforms traditional exploration strategies in gridworlds.

02. ExTra is robust to source task dissimilarity.

03. Combining ExTra with traditional methods improves convergence.

Abstract

In this work we present a novel approach for transfer-guided exploration in reinforcement learning that is inspired by the human tendency to leverage experiences from similar encounters in the past while navigating a new task. Given an optimal policy in a related task-environment, we show that its bisimulation distance from the current task-environment gives a lower bound on the optimal advantage of state-action pairs in the current task-environment. Transfer-guided Exploration (ExTra) samples actions from a Softmax distribution over these lower bounds. In this way, actions with potentially higher optimum advantage are sampled more frequently. In our experiments on gridworld environments, we demonstrate that given access to an optimal policy in a related task-environment, ExTra can outperform popular domain-specific exploration strategies viz. epsilon greedy, Model-Based Interval…
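The sampling step described in the abstract, drawing actions from a Softmax distribution over per-action lower bounds, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `softmax` and `sample_action`, the `temperature` parameter, and the numeric lower-bound values are all assumptions made for illustration, and the computation of the bounds from bisimulation distances is not shown.

```python
import math
import random


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats.

    `temperature` is an assumed knob, not taken from the paper.
    """
    scaled = [v / temperature for v in values]
    peak = max(scaled)  # subtract the max to avoid overflow in exp
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(advantage_lower_bounds, temperature=1.0, rng=random):
    """Sample an action index with probability given by a softmax
    over the supplied lower bounds on optimal advantage."""
    probs = softmax(advantage_lower_bounds, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical lower bounds for four actions in one state.
# In ExTra these would come from bisimulation distances to a source task;
# here they are made-up numbers.
lower_bounds = [-2.0, -0.5, -1.0, -3.0]

probs = softmax(lower_bounds)
rng = random.Random(0)
counts = [0] * len(lower_bounds)
for _ in range(10000):
    counts[sample_action(lower_bounds, rng=rng)] += 1
```

Under these made-up values, the action with the largest lower bound (index 1) receives the highest sampling probability, which matches the abstract's statement that actions with potentially higher optimal advantage are sampled more frequently.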

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Softmax