Deep Reinforcement Learning in Large Discrete Action Spaces

Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin

arXiv:1512.07679 · cs.AI · April 5, 2016 · 266 cites

TL;DR

This paper introduces a reinforcement learning method capable of handling large discrete action spaces efficiently by embedding actions in a continuous space and using approximate nearest-neighbor search, enabling applications to problems with up to one million actions.

Contribution

The paper proposes a novel approach combining action embedding and approximate nearest-neighbor search to enable scalable reinforcement learning in large discrete action spaces.
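
The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the toy embeddings, and the value function `q_value` are all hypothetical, and re-ranking the retrieved neighbors by a value estimate is an assumption about how the candidates are used rather than something stated in the summary above. The search here is an exact brute-force scan, used only as a stand-in; the sub-linear cost the summary mentions would come from replacing it with an approximate nearest-neighbor index.

```python
import heapq
import math
from typing import Any, Callable, List, Sequence

Vector = Sequence[float]


def k_nearest_actions(proto: Vector, embeddings: Sequence[Vector], k: int) -> List[int]:
    """Return indices of the k action embeddings closest to `proto`.

    Exact Euclidean search, O(n) per query. This is a stand-in (assumption)
    for an approximate nearest-neighbor index.
    """
    return heapq.nsmallest(
        k,
        range(len(embeddings)),
        key=lambda i: math.dist(proto, embeddings[i]),
    )


def select_action(
    state: Any,
    proto: Vector,
    embeddings: Sequence[Vector],
    q_value: Callable[[Any, Vector], float],
    k: int = 3,
) -> int:
    """Pick a discrete action index from a continuous proposal `proto`.

    Step 1: retrieve the k discrete actions nearest to the proposal.
    Step 2 (assumed here): keep the candidate with the highest value
    under the hypothetical scoring function `q_value`.
    """
    candidates = k_nearest_actions(proto, embeddings, k)
    return max(candidates, key=lambda i: q_value(state, embeddings[i]))


if __name__ == "__main__":
    # Toy 2-D embeddings for four discrete actions (illustrative values only).
    embeddings = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
    # Toy value function: prefers actions whose first coordinate is near the state.
    toy_q = lambda s, a: -abs(a[0] - s)
    print(select_action(0.0, (0.9, 0.1), embeddings, toy_q, k=2))
```

With `k=1` the sketch simply returns the nearest action; a larger `k` lets the value function override a nearest neighbor that happens to score poorly, at the cost of more value evaluations per step.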

Findings

01. Successfully applied to tasks with up to one million actions
02. Achieves sub-linear complexity in action space size
03. Demonstrates improved scalability over existing methods

Abstract

Being able to reason in an environment with a large number of discrete actions is essential to bringing reinforcement learning to a larger class of problems. Recommender systems, industrial plants and language models are only some of the many real-world tasks involving large numbers of discrete actions for which current methods are difficult or even often impossible to apply. An ability to generalize over the set of actions as well as sub-linear complexity relative to the size of the set are both necessary to handle such tasks. Current approaches are not able to provide both of these, which motivates the work in this paper. Our proposed approach leverages prior information about the actions to embed them in a continuous space upon which it can generalize. Additionally, approximate nearest-neighbor methods allow for logarithmic-time lookup complexity relative to the number of actions,…

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Neural Networks and Applications · Reinforcement Learning in Robotics