Structural Similarity for Improved Transfer in Reinforcement Learning

C. Chace Ashcraft; Benjamin Stoler; Chigozie Ewulum; Susama Agarwala

arXiv:2207.13813 · cs.LG · July 29, 2022 · 1 citation

Open Access

TL;DR

This paper introduces SS2, a new state similarity measure based on bisimulation metrics, which improves transfer learning performance in reinforcement learning tasks by quantifying task relatedness.

Contribution

The paper proposes SS2, a novel algorithm for measuring state similarity between tasks in RL, and demonstrates its effectiveness in enhancing transfer learning in GridWorld environments.

Findings

01 SS2 improves transfer performance over previous methods.

02 The similarity measure satisfies metric properties.

03 Empirical results show better Q-Learning transfer in GridWorld.
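To make the second finding concrete, the sketch below numerically checks the usual distance axioms (non-negativity, zero self-distance, symmetry, triangle inequality) on a square matrix of pairwise state distances. It is not the SS2 algorithm and not the paper's proof. The function name `satisfies_distance_axioms`, the tolerance, and the toy matrices are assumptions made for illustration only.

```python
import numpy as np


def satisfies_distance_axioms(d, tol=1e-9):
    """Check distance axioms on a square matrix d, where d[i, j] is the
    distance between states i and j (hypothetical helper, not from the paper).

    Does not test whether distinct states can have distance zero.
    """
    d = np.asarray(d, dtype=float)
    if d.ndim != 2 or d.shape[0] != d.shape[1]:
        return False

    nonneg = bool(np.all(d >= -tol))
    zero_diag = bool(np.all(np.abs(np.diag(d)) <= tol))
    symmetric = bool(np.allclose(d, d.T, atol=tol))

    # via[i, j, k] = d[i, j] + d[j, k]; the triangle inequality requires
    # d[i, k] <= via[i, j, k] for every intermediate state j.
    via = d[:, :, None] + d[None, :, :]
    triangle = bool(np.all(d[:, None, :] <= via + tol))

    return nonneg and zero_diag and symmetric and triangle


# Toy distances over three states (made-up numbers).
good = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
bad = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]  # 5 > 1 + 1 breaks the triangle inequality
print(satisfies_distance_axioms(good), satisfies_distance_axioms(bad))
```

A check like this can only falsify the axioms on a particular matrix; it is no substitute for the argument given in the paper.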

Abstract

Transfer learning is an increasingly common approach for developing performant RL agents. However, it is not well understood how to define the relationship between the source and target tasks, and how this relationship contributes to successful transfer. We present an algorithm called Structural Similarity for Two MDPs, or SS2, that calculates a state similarity measure for states in two finite MDPs based on previously developed bisimulation metrics, and show that the measure satisfies properties of a distance metric. Then, through empirical results with GridWorld navigation tasks, we provide evidence that the distance measure can be used to improve transfer performance for Q-Learning agents over previous implementations.
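One way a cross-task state distance could feed into Q-Learning transfer is sketched below: each target state starts from the Q-values of its nearest source state, and ordinary tabular Q-Learning then continues in the target task. This is a minimal illustration, not the paper's transfer procedure. The helper names `init_target_q` and `q_update`, the nearest-state rule, the assumption that both tasks share the same action set in the same order, and all numbers are hypothetical.

```python
import numpy as np


def init_target_q(q_source, dist):
    """Initialise a target Q-table from a source Q-table.

    q_source: array of shape (n_source_states, n_actions).
    dist:     array of shape (n_target_states, n_source_states), where
              dist[t, s] is the distance between target state t and source
              state s (assumed to be given, e.g. by a similarity measure).
    Assumes both tasks share the same actions in the same order.
    """
    q_source = np.asarray(q_source, dtype=float)
    dist = np.asarray(dist, dtype=float)
    nearest = np.argmin(dist, axis=1)  # closest source state per target state
    return q_source[nearest].copy()


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-Learning update in place and return q."""
    td_target = r + gamma * np.max(q[s_next])
    q[s, a] += alpha * (td_target - q[s, a])
    return q


# Toy example: 2 source states, 3 target states, 2 actions (made-up values).
q_src = [[1.0, 2.0], [3.0, 4.0]]
dist = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.4]]
q_tgt = init_target_q(q_src, dist)
q_tgt = q_update(q_tgt, s=1, a=0, r=1.0, s_next=2)
print(q_tgt)
```

Copying from the single nearest state is the simplest possible rule; a distance-weighted average over several source states would be a natural alternative under the same assumptions.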

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Domain Adaptation and Few-Shot Learning

Methods: Q-Learning