Learn Dynamic-Aware State Embedding for Transfer Learning
Kaige Yang

TL;DR
This paper proposes a novel transfer learning method in reinforcement learning that infers binary MDP dynamics from trajectories to learn task-agnostic state embeddings, improving sample efficiency across tasks.
Contribution
It introduces an online method that infers binary MDP dynamics from the trajectories of any policy and uses them to guide state embedding learning for transfer in reinforcement learning.
Findings
Enhanced transfer learning performance in various tasks.
State embeddings are task-agnostic and policy-agnostic.
Improved exploration with a new intrinsic reward.
Abstract
Transfer reinforcement learning aims to improve the sample efficiency of solving unseen tasks by leveraging experience obtained from previous tasks. We consider the setting where all tasks (MDPs) share the same environment dynamics and differ only in their reward functions. In this setting, the MDP dynamics are useful knowledge to transfer, and they can be inferred with a uniformly random policy. However, trajectories generated by a uniformly random policy are not useful for policy improvement, which severely impairs sample efficiency. Instead, we observe that the binary MDP dynamics can be inferred from the trajectories of any policy, which avoids the need for a uniformly random policy. Because the binary MDP dynamics contain the state structure shared across all tasks, we believe they are suitable for transfer. Building on this observation, we introduce a method to infer the binary MDP dynamics online and at the same time utilize them…
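The abstract is truncated before it defines the binary MDP dynamic, so the following is only a minimal sketch under a stated assumption: that the binary dynamic is an indicator of which (state, action, next state) transitions are possible, as opposed to how probable they are. The tabular setting, the function names `update_binary_dynamics` and `infer_binary_dynamics`, and the toy trajectories are all hypothetical and are not taken from the paper.

```python
import numpy as np


def update_binary_dynamics(B, s, a, s_next):
    """Online update: mark the transition (s, a, s_next) as observed.

    ASSUMPTION: the "binary MDP dynamic" is read here as the indicator
    of transitions seen at least once; the paper may define it differently.
    """
    B[s, a, s_next] = True
    return B


def infer_binary_dynamics(trajectories, n_states, n_actions):
    """Build the boolean transition indicator from a batch of trajectories.

    Each trajectory is a list of (s, a, s_next) integer triples.
    """
    B = np.zeros((n_states, n_actions, n_states), dtype=bool)
    for traj in trajectories:
        for s, a, s_next in traj:
            update_binary_dynamics(B, s, a, s_next)
    return B


# Hypothetical 3-state chain with 2 actions: action 0 stays, action 1 moves
# right, and the last state absorbs.
# Trajectory from a goal-directed policy (always moves right, then stays).
trajs_directed = [[(0, 1, 1), (1, 1, 2), (2, 0, 2)]]
# Trajectory from a more exploratory policy that tries both actions.
trajs_exploratory = [[(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 2), (2, 1, 2), (2, 0, 2)]]

B_directed = infer_binary_dynamics(trajs_directed, n_states=3, n_actions=2)
B_exploratory = infer_binary_dynamics(trajs_exploratory, n_states=3, n_actions=2)
```

In this sketch the indicator depends only on whether a transition was observed, not on how often the policy took it, so every entry marked from the goal-directed trajectory is also marked from the exploratory one. That is one way to read the abstract's claim that any policy's trajectories can be used, although a policy that never visits a state-action pair leaves its entries unmarked.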
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · EEG and Brain-Computer Interfaces
