State Space Decomposition and Subgoal Creation for Transfer in Deep Reinforcement Learning

Himanshu Sahni; Saurabh Kumar; Farhan Tejani; Yannick Schroecker; Charles Isbell

arXiv:1705.08997 · cs.AI · May 26, 2017 · 2 cites

Open Access

TL;DR

This paper introduces a framework enabling deep reinforcement learning agents to generalize policies across domains by decomposing tasks into subtasks using a recurrent attention mechanism guided by a meta-controller.

Contribution

It proposes a novel attention-based method for subgoal creation that improves policy transferability in deep RL, addressing generalization limitations.

Findings

01

Meta-controller learns to create effective subgoals within attention.

02

Attention mechanism enhances policy generalization across domains.

03

Baseline without attention performs less effectively.

Abstract

Typical reinforcement learning (RL) agents learn to complete tasks specified by reward functions tailored to their domain. As such, the policies they learn do not generalize even to similar domains. To address this issue, we develop a framework through which a deep RL agent learns to generalize policies from smaller, simpler domains to more complex ones using a recurrent attention mechanism. The task is presented to the agent as an image and an instruction specifying the goal. A meta-controller guides the agent towards its goal by designing a sequence of smaller subtasks on the part of the state space within the attention, effectively decomposing it. As a baseline, we also consider a setup without attention. Our experiments show that the meta-controller learns to create subgoals within the attention.
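The decomposition idea in the abstract can be illustrated with a toy grid world. The sketch below is not the authors' implementation: the names `Attention`, `place_attention` and `plan_subgoals` are hypothetical, the attention window is assumed to be a square centred on the agent, and a hand-coded rule (pick the window cell nearest the final goal) stands in for the learned meta-controller. It only shows how restricting subgoals to the attended part of the state turns one long task into a chain of short ones.

```python
from dataclasses import dataclass
from typing import List, Tuple

Cell = Tuple[int, int]          # (row, col) in full-grid coordinates
Grid = List[List[int]]


def clamp(value: int, low: int, high: int) -> int:
    """Limit value to the closed interval [low, high]."""
    return max(low, min(value, high))


@dataclass(frozen=True)
class Attention:
    """Square window over the full state, identified by its top-left cell."""
    row: int
    col: int
    size: int

    def crop(self, grid: Grid) -> Grid:
        """Return the part of the state that lies inside the window."""
        return [line[self.col:self.col + self.size]
                for line in grid[self.row:self.row + self.size]]

    def nearest_cell(self, target: Cell) -> Cell:
        """Return the window cell closest to target (target itself if inside)."""
        last = self.size - 1
        return (clamp(target[0], self.row, self.row + last),
                clamp(target[1], self.col, self.col + last))


def place_attention(agent: Cell, grid_size: int, size: int) -> Attention:
    """Centre the window on the agent, shifted so it stays on the grid."""
    half = size // 2
    return Attention(clamp(agent[0] - half, 0, grid_size - size),
                     clamp(agent[1] - half, 0, grid_size - size),
                     size)


def plan_subgoals(agent: Cell, goal: Cell, grid_size: int,
                  window: int, max_steps: int = 100) -> List[Cell]:
    """Chain window-local subgoals until the final goal is reached.

    ASSUMPTION: a hand-coded rule stands in for the learned meta-controller,
    and the low-level policy is assumed to reach every subgoal it is given.
    """
    subgoals: List[Cell] = []
    for _ in range(max_steps):
        if agent == goal:
            break
        attention = place_attention(agent, grid_size, window)
        subgoal = attention.nearest_cell(goal)
        subgoals.append(subgoal)
        agent = subgoal          # stand-in for executing the low-level policy
    return subgoals
```

Under these assumptions, calling `plan_subgoals((0, 0), (7, 5), grid_size=9, window=3)` yields six subgoals, each lying inside a 3x3 window. A policy trained only on 3x3 problems could in principle solve each of them, which is the kind of transfer from smaller to larger domains that the abstract describes.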

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Fault Detection and Control Systems