Mutual Information Based Knowledge Transfer Under State-Action Dimension Mismatch
Michael Wan, Tanmay Gangwani, Jian Peng

TL;DR
This paper introduces a novel transfer learning framework for deep reinforcement learning that enables knowledge transfer between tasks with different state and action spaces using embeddings and mutual information maximization.
Contribution
It proposes a transfer learning method for RL that handles arbitrary state-action mismatch between the teacher and the student, using embeddings and mutual information maximization to transfer knowledge.
Findings
Successful transfer in robotic locomotion tasks with different state-action spaces
Embeddings enriched by mutual information improve transfer quality
Framework handles complex, high-dimensional tasks effectively
Abstract
Deep reinforcement learning (RL) algorithms have achieved great success on a wide variety of sequential decision-making tasks. However, many of these algorithms suffer from high sample complexity when learning from scratch using environmental rewards, due to issues such as credit-assignment and high-variance gradients, among others. Transfer learning, in which knowledge gained on a source task is applied to more efficiently learn a different but related target task, is a promising approach to improve the sample complexity in RL. Prior work has considered using pre-trained teacher policies to enhance the learning of the student policy, albeit with the constraint that the teacher and the student MDPs share the state-space or the action-space. In this paper, we propose a new framework for transfer learning where the teacher and the student can have arbitrarily different state- and…
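To make the idea of embeddings tied to their inputs through mutual information more concrete, here is a minimal sketch, not the authors' implementation. It assumes a linear embedding from a student state to a vector of the teacher's state dimension, and it swaps in the InfoNCE lower bound as one possible mutual information estimate; the summary above does not say which estimator the paper uses. The names `embed`, `infonce_lower_bound`, `dot_critic`, and the dimensions are all hypothetical.

```python
import math
import random


def embed(state, weights):
    """Hypothetical linear embedding: student state (len d_s) -> vector of len d_t."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def infonce_lower_bound(states, embeddings, critic):
    """InfoNCE estimate of a lower bound on I(state; embedding).

    Computes mean_i [ f(s_i, e_i) - log( (1/N) * sum_j exp f(s_i, e_j) ) ],
    which can never exceed log(N) for a batch of size N.
    """
    n = len(states)
    total = 0.0
    for i in range(n):
        scores = [critic(states[i], embeddings[j]) for j in range(n)]
        m = max(scores)  # subtract the max for numerical stability
        log_mean_exp = m + math.log(sum(math.exp(x - m) for x in scores) / n)
        total += scores[i] - log_mean_exp
    return total / n


# Illustrative setup (assumed sizes): student states have 4 dimensions,
# the teacher's state space has 2, and the batch holds 8 samples.
random.seed(0)
D_STUDENT, D_TEACHER, N = 4, 2, 8
W = [[random.gauss(0, 1) for _ in range(D_STUDENT)] for _ in range(D_TEACHER)]
states = [[random.gauss(0, 1) for _ in range(D_STUDENT)] for _ in range(N)]
embeddings = [embed(s, W) for s in states]


def dot_critic(state, embedding):
    """Hypothetical critic: dot product between the state's own embedding and a candidate."""
    return sum(a * b for a, b in zip(embed(state, W), embedding))


bound = infonce_lower_bound(states, embeddings, dot_critic)
print(f"InfoNCE bound: {bound:.3f} (cannot exceed log N = {math.log(N):.3f})")
```

In a training loop, a bound like this would typically be maximized with respect to the embedding parameters alongside the RL objective; how the paper combines the two terms is not stated in this summary.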
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Evolutionary Algorithms and Applications
