Transfer Reinforcement Learning in Heterogeneous Action Spaces using Subgoal Mapping
Kavinayan P. Sivakumar, Yan Zhang, Zachary Bell, Scott Nivison, Michael M. Zavlanos

TL;DR
This paper introduces a method for transfer reinforcement learning across agents with different action spaces by learning a subgoal mapping using LSTM networks, enhancing sample efficiency and training speed in new tasks.
Contribution
It proposes a novel subgoal mapping approach that generalizes transfer learning across heterogeneous action spaces without requiring handcrafted mappings or sharing policy parameters.
Findings
The method effectively learns subgoal mappings for various tasks.
Imitation of expert policies with the learned mapping improves sample efficiency.
The approach reduces training time for unseen tasks.
Abstract
In this paper, we consider a transfer reinforcement learning problem involving agents with different action spaces. Specifically, for any new unseen task, the goal is to use a successful demonstration of this task by an expert agent in its action space to enable a learner agent to learn an optimal policy in its own, different action space with fewer samples than it would need if it were learning on its own. Existing transfer learning methods across different action spaces either require handcrafted mappings between those action spaces provided by human experts, which can induce bias in the learning procedure, or require the expert agent to share its policy parameters with the learner agent, which does not generalize well to unseen tasks. In this work, we propose a method that learns a subgoal mapping between the expert agent policy and the learner agent policy. Since the expert…
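To make the idea of a learned subgoal mapping more concrete, here is a minimal, untrained sketch of an LSTM that reads a sequence of expert subgoals and emits one learner subgoal per step. It is not the authors' implementation: the class name `TinyLSTMSubgoalMapper`, the discrete subgoal indices, the one-hot encoding, the layer sizes, and the argmax readout are all assumptions made for illustration, and the training procedure is omitted entirely.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMSubgoalMapper:
    """Hypothetical sketch: expert subgoal sequence -> learner subgoal sequence."""

    def __init__(self, n_expert, n_learner, hidden, seed=0):
        rng = random.Random(seed)
        self.n_expert, self.n_learner, self.hidden = n_expert, n_learner, hidden
        in_dim = n_expert + hidden  # one-hot input concatenated with previous hidden state
        # One weight matrix and bias per gate: input (i), forget (f), output (o), candidate (g).
        # Weights are random here; a real mapper would be trained on paired trajectories.
        self.W = {
            g: [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(hidden)]
            for g in "ifog"
        }
        self.b = {g: [0.0] * hidden for g in "ifog"}
        # Linear readout from the hidden state to learner-subgoal scores.
        self.W_out = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(n_learner)]

    def _gate(self, name, x, act):
        return [
            act(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(self.W[name], self.b[name])
        ]

    def forward(self, expert_subgoals):
        h = [0.0] * self.hidden  # hidden state
        c = [0.0] * self.hidden  # cell state
        predictions = []
        for s in expert_subgoals:
            one_hot = [1.0 if k == s else 0.0 for k in range(self.n_expert)]
            x = one_hot + h
            i = self._gate("i", x, sigmoid)
            f = self._gate("f", x, sigmoid)
            o = self._gate("o", x, sigmoid)
            g = self._gate("g", x, math.tanh)
            # Standard LSTM cell update.
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
            scores = [sum(w * v for w, v in zip(row, h)) for row in self.W_out]
            # Pick the highest-scoring learner subgoal for this step.
            predictions.append(max(range(self.n_learner), key=lambda k: scores[k]))
        return predictions


mapper = TinyLSTMSubgoalMapper(n_expert=4, n_learner=3, hidden=8)
expert_trajectory = [0, 2, 1, 3]  # made-up expert subgoal indices
learner_subgoals = mapper.forward(expert_trajectory)
```

Because the recurrent state carries over between steps, the learner subgoal chosen at each step can depend on the whole expert prefix rather than on the current expert subgoal alone, which is the usual reason for choosing a recurrent model for sequence-to-sequence mappings of this kind. With random weights the output is arbitrary; the sketch only shows the data flow.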
Taxonomy
Topics: Neural Networks and Applications · Robot Manipulation and Learning · Reinforcement Learning in Robotics
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
