Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives

Anirudh Goyal; Shagun Sodhani; Jonathan Binas; Xue Bin Peng; Sergey Levine; Yoshua Bengio

arXiv:1906.10667·cs.LG·June 26, 2019·23 cites

Open Access

TL;DR

This paper introduces a decentralized reinforcement learning approach where primitives compete based on information needs, leading to improved generalization without a high-level meta-policy.

Contribution

The work proposes a novel primitive-based policy architecture with decentralized decision-making and information-theoretic competition, eliminating the need for a meta-policy.

Findings

1. Outperforms flat policies in generalization tasks
2. Enables primitives to specialize through information regularization
3. Demonstrates effective decentralized decision-making

Abstract

Reinforcement learning agents that operate in diverse and complex environments can benefit from the structured decomposition of their behavior. Often, this is addressed in the context of hierarchical reinforcement learning, where the aim is to decompose a policy into lower-level primitives or options, and a higher-level meta-policy that triggers the appropriate behaviors for a given situation. However, the meta-policy must still produce appropriate decisions in all states. In this work, we propose a policy design that decomposes into primitives, similarly to hierarchical reinforcement learning, but without a high-level meta-policy. Instead, each primitive decides for itself whether it wishes to act in the current state. We use an information-theoretic mechanism to enable this decentralized decision: each primitive chooses how much information it needs about the current state…
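The decentralized decision described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that each primitive encodes the state as a diagonal Gaussian, that "how much information it needs" is measured as the KL divergence from a standard normal prior, and that the primitive with the largest information cost is the one that acts. The function names and the example encoder outputs are hypothetical.

```python
import math


def gaussian_kl_to_standard_normal(mean, log_std):
    """KL(N(mean, diag(exp(log_std)^2)) || N(0, I)), summed over dimensions."""
    kl = 0.0
    for m, ls in zip(mean, log_std):
        var = math.exp(2.0 * ls)
        kl += 0.5 * (var + m * m - 1.0) - ls
    return kl


def select_primitive(encodings):
    """Pick a primitive from per-primitive (mean, log_std) state encodings.

    Assumption: the primitive with the highest information cost acts.
    Returns the winner's index and softmax weights over the costs.
    """
    costs = [gaussian_kl_to_standard_normal(m, ls) for m, ls in encodings]
    peak = max(costs)  # subtract the max for a numerically stable softmax
    exps = [math.exp(c - peak) for c in costs]
    total = sum(exps)
    weights = [e / total for e in exps]
    winner = max(range(len(costs)), key=costs.__getitem__)
    return winner, weights


# Hypothetical encoder outputs for three primitives in one state.
encodings = [
    ([0.0, 0.0], [0.0, 0.0]),   # identical to the prior: uses no information
    ([1.0, -1.0], [0.0, 0.0]),  # far from the prior: uses the most information
    ([0.1, 0.0], [0.0, 0.0]),   # close to the prior
]
winner, weights = select_primitive(encodings)
print(winner, [round(w, 3) for w in weights])
```

No meta-policy appears anywhere in this sketch: the choice of which primitive acts falls out of quantities each primitive computes on its own. The softmax weights show how the same costs could yield a soft, rather than hard, selection.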

Taxonomy

Topics: Reinforcement Learning in Robotics · Elevator Systems and Control · Adaptive Dynamic Programming Control