Dynamic Neighborhood Construction for Structured Large Discrete Action Spaces

Fabian Akkerman; Julius Luy; Wouter van Heeswijk; Maximilian Schiffer

arXiv:2305.19891 · cs.LG · September 24, 2024 · 2 cites

TL;DR

This paper introduces Dynamic Neighborhood Construction (DNC), a scalable method for efficiently exploring structured large discrete action spaces in reinforcement learning, capable of handling sizes up to 10^73 actions and outperforming existing approaches.

Contribution

The paper proposes a novel DNC paradigm and heuristic for structured large discrete action spaces, enabling scalable exploration beyond current benchmarks.

Findings

01. DNC matches or outperforms state-of-the-art methods.

02. DNC efficiently explores extremely large action spaces.

03. The method scales to intractably large action spaces.

Abstract

Large discrete action spaces (LDAS) remain a central challenge in reinforcement learning. Existing solution approaches can handle unstructured LDAS with up to a few million actions. However, many real-world applications in logistics, production, and transportation systems have combinatorial action spaces, whose size grows well beyond millions of actions, even on small instances. Fortunately, such action spaces exhibit structure, e.g., equally spaced discrete resource units. With this work, we focus on handling structured LDAS (SLDAS) with sizes that cannot be handled by current benchmarks: we propose Dynamic Neighborhood Construction (DNC), a novel exploitation paradigm for SLDAS. We present a scalable neighborhood exploration heuristic that utilizes this paradigm and efficiently explores the discrete neighborhood around the continuous proxy action in structured action spaces with up to…
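To make the idea of exploring the discrete neighborhood around a continuous proxy action concrete, here is a minimal sketch. It is not the authors' heuristic: it assumes a box-shaped grid of equally spaced actions, a plain greedy hill climb over one-step neighbors, and a user-supplied `q_value` function standing in for a learned critic. All function names and parameters (`nearest_grid_action`, `neighbors`, `greedy_neighborhood_search`, `low`, `high`, `step`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def nearest_grid_action(proxy, low, high, step):
    """Snap a continuous proxy action to the nearest point of an equally spaced grid."""
    proxy = np.asarray(proxy, dtype=float)
    snapped = low + np.round((proxy - low) / step) * step
    return np.clip(snapped, low, high)


def neighbors(action, low, high, step):
    """Grid points that differ from `action` by exactly one step in one dimension."""
    result = []
    for i in range(len(action)):
        for delta in (-step, step):
            candidate = action.copy()
            candidate[i] += delta
            if low <= candidate[i] <= high:
                result.append(candidate)
    return result


def greedy_neighborhood_search(proxy, q_value, low=0.0, high=10.0, step=1.0, max_iters=50):
    """Greedy local search (an assumed stand-in for the paper's exploration heuristic).

    Starts at the grid point closest to the proxy action and moves to the
    best-scoring neighbor until no neighbor improves the assumed critic value.
    """
    best = nearest_grid_action(proxy, low, high, step)
    best_q = q_value(best)
    for _ in range(max_iters):
        candidates = neighbors(best, low, high, step)
        if not candidates:
            break
        scores = [q_value(c) for c in candidates]
        j = int(np.argmax(scores))
        if scores[j] <= best_q:
            break  # local optimum: no neighbor is strictly better
        best, best_q = candidates[j], scores[j]
    return best, best_q


# Toy usage: a hypothetical critic that peaks at the action (3, 7, 5).
target = np.array([3.0, 7.0, 5.0])
toy_q = lambda a: -float(np.sum((a - target) ** 2))
best_action, best_value = greedy_neighborhood_search([2.4, 6.6, 5.2], toy_q)
print(best_action, best_value)
```

In this toy setting the grid has 11 levels in each of 3 dimensions, so 11^3 = 1331 actions in total, yet the search scores only the snapped action and its one-step neighbors at each iteration. The number of evaluations per iteration grows linearly with the number of dimensions, while the size of the action space grows exponentially.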

Code & Models

Repositories

tumbais/dynamicneighborhoodconstruction (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Focus