Solving Continual Combinatorial Selection via Deep Reinforcement Learning

Hyungseok Song; Hyeryung Jang; Hai H. Tran; Se-eun Yoon; Kyunghwan Son; Donggyu Yun; Hyoju Chung; Yung Yi

arXiv:1909.03638·cs.LG·September 10, 2019

TL;DR

This paper introduces a deep reinforcement learning approach for solving large-scale combinatorial selection problems by transforming the problem into an iterative form and leveraging weight sharing to handle large state spaces effectively.

Contribution

The paper proposes a transformation of the Select-MDP (S-MDP) into an Iterative Select-MDP (IS-MDP) and introduces weight-shared Q-networks to handle the resulting large state space in deep RL.
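To make the weight-sharing idea concrete, here is a minimal sketch of one way a Q-function can score every item with the same parameters. It assumes the symmetry being exploited is invariance to the ordering of items, and it uses a single linear layer with mean pooling; the function name `shared_q_values` and the parameters `w_self`, `w_pool`, and `bias` are hypothetical and are not taken from the paper, whose actual architecture may differ.

```python
def shared_q_values(items, w_self, w_pool, bias):
    """Score every item with one shared set of parameters.

    items:  list of N feature vectors, each a list of d floats
    w_self: d weights applied to an item's own features
    w_pool: d weights applied to the mean of all items' features
    bias:   scalar offset
    Returns a list with one Q-value per item.
    """
    n = len(items)
    d = len(w_self)
    # Mean pooling gives every item the same order-independent summary of the set.
    mean = [sum(item[j] for item in items) / n for j in range(d)]
    context = sum(w * m for w, m in zip(w_pool, mean)) + bias
    # The same w_self is reused for each item, so the parameter count is 2*d + 1
    # regardless of how many items there are.
    return [sum(w * x for w, x in zip(w_self, item)) + context for item in items]


if __name__ == "__main__":
    items = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(shared_q_values(items, [0.5, -1.0], [0.25, 0.25], 0.1))
```

Because the parameters do not depend on the number of items, reordering the input list only reorders the output scores, and the same function can be applied to item sets of different sizes. That is the property this sketch is meant to illustrate, under the assumptions stated above.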

Findings

1. The method scales well to large item spaces.

2. It maintains performance across different item spaces.

3. It outperforms baseline approaches in experiments.

Abstract

We consider the Markov Decision Process (MDP) of selecting a subset of items at each step, termed the Select-MDP (S-MDP). The large state and action spaces of S-MDPs make them intractable to solve with typical reinforcement learning (RL) algorithms, especially when the number of items is huge. In this paper, we present a deep RL algorithm to solve this issue by adopting the following key ideas. First, we convert the original S-MDP into an Iterative Select-MDP (IS-MDP), which is equivalent to the S-MDP in terms of optimal actions. IS-MDP decomposes a joint action of selecting K items simultaneously into K iterative selections, resulting in the decrease of actions at the expense of an exponential increase of states. Second, we overcome this state space explosion by exploiting a special symmetry in IS-MDPs with novel weight shared Q-networks, which provably maintain sufficient expressive…
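The trade-off described in the abstract can be illustrated with simple counting. The sketch below is a simplification, not the paper's formulation: it assumes each action just picks items from N candidates, that the K items are chosen without replacement, and that order is ignored in the joint case. The function names are hypothetical.

```python
from math import comb


def joint_action_count(n_items: int, k: int) -> int:
    """Number of joint actions when K of N items are chosen at once (order ignored)."""
    return comb(n_items, k)


def iterative_action_counts(n_items: int, k: int) -> list[int]:
    """Per-step action counts when the K items are picked one at a time.

    Assumes selection without replacement, so each step has one fewer candidate.
    """
    return [n_items - step for step in range(k)]


if __name__ == "__main__":
    n, k = 50, 5
    print("joint actions:", joint_action_count(n, k))
    print("actions per iterative step:", iterative_action_counts(n, k))
```

With N = 50 and K = 5 there are 2,118,760 joint actions, while no single iterative step offers more than 50 choices. The cost is that each intermediate state must also record which items have already been picked, which is where the growth in the number of states comes from.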
