DAC: The Double Actor-Critic Architecture for Learning Options

Shangtong Zhang; Shimon Whiteson

arXiv:1904.12691 · cs.LG · September 12, 2019 · 31 cites

TL;DR

This paper reformulates the option framework as two parallel augmented MDPs, so that off-the-shelf policy optimization algorithms can be used to learn options. Applying an actor-critic algorithm to each MDP yields DAC, an architecture that improves transfer learning performance on robot simulation tasks.

Contribution

The paper proposes the Double Actor-Critic (DAC) architecture, which reformulates the option framework as two parallel augmented MDPs and applies an actor-critic algorithm to each, and demonstrates its effectiveness in a transfer learning setting.

Findings

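01. In a transfer learning setting on robot simulation tasks, DAC outperforms both its hierarchy-free counterpart and previous gradient-based option learning algorithms.

02. When state-value functions are used as critics, one critic can be expressed in terms of the other, so only one critic is needed.

03. The reformulation allows all policy optimization algorithms to be used off the shelf to learn intra-option policies, option termination conditions, and a master policy over options.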

Abstract

We reformulate the option framework as two parallel augmented MDPs. Under this novel formulation, all policy optimization algorithms can be used off the shelf to learn intra-option policies, option termination conditions, and a master policy over options. We apply an actor-critic algorithm on each augmented MDP, yielding the Double Actor-Critic (DAC) architecture. Furthermore, we show that, when state-value functions are used as critics, one critic can be expressed in terms of the other, and hence only one critic is necessary. We conduct an empirical study on challenging robot simulation tasks. In a transfer learning setting, DAC outperforms both its hierarchy-free counterpart and previous gradient-based option learning algorithms.
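The abstract does not spell out how the two augmented MDPs are built, so the following is only a minimal sketch under stated assumptions. It assumes the usual call-and-return option model: the previous option terminates in the current state with probability beta, and on termination the master policy selects a new option. Under that assumption, the "high" MDP can be read as acting on (previous option, state) pairs and the "low" MDP on (state, option) pairs, and the high-level state value becomes an expectation of low-level values. All function names, variable names, and numbers below are hypothetical and chosen for illustration only.

```python
def high_level_policy(prev_option, beta_s, master_probs_s):
    """Distribution over the next option, given the previous option.

    Assumed inputs (all evaluated at the current state):
    beta_s[o]         -- probability that option o terminates
    master_probs_s[o] -- master policy's probability of picking option o
    """
    b = beta_s[prev_option]
    # With probability b the previous option terminates and the
    # master policy re-selects an option.
    probs = [b * p for p in master_probs_s]
    # With probability 1 - b the previous option simply continues.
    probs[prev_option] += 1.0 - b
    return probs


def high_value_from_low(prev_option, beta_s, master_probs_s, v_low_s):
    """High-level state value written in terms of low-level values.

    v_low_s[o] -- a single critic's value for the (state, option o) pair.
    The expectation is taken under the composed high-level policy.
    """
    pi_high = high_level_policy(prev_option, beta_s, master_probs_s)
    return sum(p * v for p, v in zip(pi_high, v_low_s))


# Illustrative numbers for two options in one state (not from the paper).
beta_s = [0.2, 0.9]
master_probs_s = [0.5, 0.5]
v_low_s = [1.0, 3.0]

pi_high = high_level_policy(0, beta_s, master_probs_s)
v_high = high_value_from_low(0, beta_s, master_probs_s, v_low_s)
```

In this reading, the termination conditions and the master policy are folded into one policy over options, which is why a generic policy optimization algorithm could be applied to it directly, and why a second critic would be redundant once the first is available.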

Taxonomy

Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Evolutionary Algorithms and Applications