Option Encoder: A Framework for Discovering a Policy Basis in Reinforcement Learning

Arjun Manoharan, Rahul Ramesh, and Balaraman Ravindran

arXiv:1909.04134·cs.LG·July 6, 2020

TL;DR

This paper introduces Option Encoder, an auto-encoder framework that compresses multiple options into a concise policy basis, improving efficiency and performance in hierarchical reinforcement learning tasks.

Contribution

The paper proposes a novel auto-encoder-based method with constrained weights that discovers a compact policy basis representing multiple options.
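
To make the idea of a policy basis concrete, here is a minimal NumPy sketch of one forward pass through an option auto-encoder. It is not the authors' implementation. The summary above only says the weights are "constrained"; this sketch assumes the constraint is that each option is reconstructed as a convex combination of basis policies, and it assumes a KL-divergence reconstruction loss. The names `encode`, `decode`, `kl_loss`, `enc_weights`, and `mix_logits` are hypothetical, the parameters are random, and no training loop is shown.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def encode(options, enc_weights, num_basis):
    """Map option policies (S, M, A) to basis policies (S, K, A).

    Assumption: a single linear layer followed by a softmax over actions,
    so every basis policy is a valid action distribution in every state.
    """
    num_states, num_options, num_actions = options.shape
    flat = options.reshape(num_states, num_options * num_actions)
    logits = flat @ enc_weights  # (S, K * A)
    return softmax(logits.reshape(num_states, num_basis, num_actions), axis=-1)


def decode(basis, mix_logits):
    """Rebuild each option as a convex combination of the basis policies.

    Assumption: the "constrained weights" are mixing weights that are
    non-negative and sum to one per option (enforced here with a softmax).
    """
    mix = softmax(mix_logits, axis=-1)  # (M, K), each row sums to 1
    return np.einsum("mk,ska->sma", mix, basis)  # (S, M, A)


def kl_loss(options, recon, eps=1e-8):
    """Mean KL(option || reconstruction) over states and options."""
    kl = np.sum(options * (np.log(options + eps) - np.log(recon + eps)), axis=-1)
    return float(np.mean(kl))


# Toy sizes (hypothetical): 5 states, 4 options, 3 actions, 2 basis policies.
rng = np.random.default_rng(0)
S, M, A, K = 5, 4, 3, 2
options = softmax(rng.normal(size=(S, M, A)), axis=-1)
enc_weights = 0.1 * rng.normal(size=(M * A, K * A))
mix_logits = rng.normal(size=(M, K))

basis = encode(options, enc_weights, num_basis=K)
recon = decode(basis, mix_logits)
loss = kl_loss(options, recon)
print("reconstruction shape:", recon.shape)
print("untrained KL loss:", loss)
```

Because the mixing weights are a softmax output, every reconstructed option is itself a valid distribution over actions, which is what lets a small basis stand in for the larger option set. Minimizing `loss` with respect to `enc_weights` and `mix_logits` would be the training step, which this sketch leaves out.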

Findings

01. Effective in grid-world environments

02. Successfully applied to high-dimensional robotic tasks

03. Reduces redundancy and improves task performance

Abstract

Option discovery and skill acquisition frameworks are integral to the functioning of a hierarchically organized reinforcement learning agent. However, such techniques often yield a large number of options or skills, which can potentially be represented succinctly by filtering out any redundant information. Such a reduction can lower the required computation while also improving performance on a target task. To compress an array of option policies, we attempt to find a policy basis that accurately captures the set of all options. In this work, we propose Option Encoder, an auto-encoder-based framework with intelligently constrained weights that helps discover a collection of basis policies. The policy basis can be used as a proxy for the original set of skills in a suitable hierarchically organized framework. We demonstrate the efficacy of our method on a collection of…
