On the Role of Weight Sharing During Deep Option Learning

Matthew Riemer; Ignacio Cases; Clemens Rosenbaum; Miao Liu; Gerald Tesauro

arXiv:1912.13408 · cs.LG · February 7, 2020

TL;DR

This paper investigates weight sharing among the components of option-critic in deep reinforcement learning. It reports that dropping the parameter-independence assumption can improve training stability and convergence speed in Atari game experiments.

Contribution

The paper introduces new algorithms that optimize the full option-critic architecture with each update when its components share parameters, rather than assuming, as prior work does, that each component has independent parameters.
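To make the parameter-independence assumption concrete, here is a minimal toy sketch. It is not the paper's algorithm: the two-head scalar network, the function names (`forward`, `termination_step`), and the numeric values are all hypothetical, chosen only to show that a gradient step on one component (the termination head) also changes another component (the intra-option policy) when they share a body, and leaves it untouched when the shared body is treated as independent.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(shared_w, policy_w, term_w, state):
    """Toy two-head network: one shared feature feeds both heads."""
    feature = math.tanh(shared_w * state)      # shared body
    action_prob = sigmoid(policy_w * feature)  # intra-option policy head
    term_prob = sigmoid(term_w * feature)      # termination head
    return feature, action_prob, term_prob


def termination_step(shared_w, policy_w, term_w, state, lr=0.5, update_shared=True):
    """One gradient-descent step that lowers the termination probability.

    With update_shared=False the shared body is frozen, which mimics the
    parameter-independence assumption: only the termination head moves.
    """
    feature, _, term_prob = forward(shared_w, policy_w, term_w, state)
    d_logit = term_prob * (1.0 - term_prob)  # derivative of sigmoid at the logit
    grad_term_w = d_logit * feature
    # Chain rule through tanh into the shared weight.
    grad_shared_w = d_logit * term_w * (1.0 - feature ** 2) * state
    new_term_w = term_w - lr * grad_term_w
    new_shared_w = shared_w - lr * grad_shared_w if update_shared else shared_w
    return new_shared_w, policy_w, new_term_w


# Hypothetical starting parameters: (shared_w, policy_w, term_w).
params = (0.8, 1.5, -0.7)
state = 1.0

_, before, _ = forward(*params, state)
_, after_shared, _ = forward(*termination_step(*params, state), state)
_, after_independent, _ = forward(
    *termination_step(*params, state, update_shared=False), state
)

# How much the policy head's output moved after a termination-only update.
policy_shift_shared = abs(after_shared - before)
policy_shift_independent = abs(after_independent - before)
```

In this sketch `policy_shift_independent` is exactly zero because neither the shared weight nor the policy weight changes, while `policy_shift_shared` is nonzero because the termination gradient flows into the shared body. Deep implementations with a shared feature extractor are in the second situation.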

Findings

01 Improved training stability in deep option learning.

02 Faster convergence in Atari game experiments.

03 Enhanced sample efficiency with shared parameters.

Abstract

The options framework is a popular approach for building temporally extended actions in reinforcement learning. In particular, the option-critic architecture provides general purpose policy gradient theorems for learning actions from scratch that are extended in time. However, past work makes the key assumption that each of the components of option-critic has independent parameters. In this work we note that while this key assumption of the policy gradient theorems of option-critic holds in the tabular case, it is always violated in practice for the deep function approximation setting. We thus reconsider this assumption and consider more general extensions of option-critic and hierarchical option-critic training that optimize for the full architecture with each update. It turns out that not assuming parameter independence challenges a belief in prior work that training the policy over…
