Solving Continual Combinatorial Selection via Deep Reinforcement Learning
Hyungseok Song, Hyeryung Jang, Hai H. Tran, Se-eun Yoon, Kyunghwan Son, Donggyu Yun, Hyoju Chung, Yung Yi

TL;DR
This paper introduces a deep reinforcement learning approach for solving large-scale combinatorial selection problems by transforming the problem into an iterative form and leveraging weight sharing to handle large state spaces effectively.
Contribution
The paper proposes a novel transformation of the Select-MDP (S-MDP) into an Iterative Select-MDP (IS-MDP) that preserves optimal actions, and introduces weight-shared Q-networks to manage the larger state space this transformation creates.
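As a rough illustration of what weight sharing across items can look like, the sketch below scores every item with the same small set of parameters plus a pooled context term. It is not the authors' architecture: the function `shared_q_values` and the parameters `w_self`, `w_ctx`, and `bias` are hypothetical, and the sketch assumes the relevant symmetry is that reordering the items should only reorder their scores.

```python
def shared_q_values(items, w_self, w_ctx, bias):
    """Score every item with one shared parameter set (hypothetical sketch).

    items  : list of feature vectors, one per item
    w_self : weights applied to an item's own features
    w_ctx  : weights applied to the mean of all item features (shared context)
    bias   : scalar offset
    """
    n = len(items)
    dim = len(items[0])
    # Pooled context: the mean feature vector does not depend on item order.
    mean = [sum(x[d] for x in items) / n for d in range(dim)]
    ctx = sum(w * m for w, m in zip(w_ctx, mean))
    # The same weights are reused for every item, so the parameter count
    # (len(w_self) + len(w_ctx) + 1) does not grow with the number of items.
    return [sum(w * v for w, v in zip(w_self, x)) + ctx + bias for x in items]


items = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
w_self, w_ctx, bias = [0.5, -1.0], [0.25, 0.25], 0.1

q = shared_q_values(items, w_self, w_ctx, bias)

# Reordering the items reorders the scores in the same way.
perm = [2, 0, 1]
q_perm = shared_q_values([items[i] for i in perm], w_self, w_ctx, bias)
print(all(abs(q_perm[j] - q[perm[j]]) < 1e-9 for j in range(len(perm))))
```

Because the parameters are shared, the same function can be applied unchanged to a list with more or fewer items, which is the property that makes this kind of design attractive when the item space is large.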
Findings
The method scales well to large item spaces.
It maintains performance across different item spaces.
It outperforms baseline approaches in experiments.
Abstract
We consider the Markov Decision Process (MDP) of selecting a subset of items at each step, termed the Select-MDP (S-MDP). The large state and action spaces of S-MDPs make them intractable to solve with typical reinforcement learning (RL) algorithms, especially when the number of items is huge. In this paper, we present a deep RL algorithm to solve this issue by adopting the following key ideas. First, we convert the original S-MDP into an Iterative Select-MDP (IS-MDP), which is equivalent to the S-MDP in terms of optimal actions. IS-MDP decomposes a joint action of selecting K items simultaneously into K iterative selections, resulting in a decrease in the number of actions at the expense of an exponential increase in the number of states. Second, we overcome this state space explosion by exploiting a special symmetry in IS-MDPs with novel weight shared Q-networks, which provably maintain sufficient expressive…
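The trade-off described in the abstract can be made concrete with a small counting sketch. It assumes plain subset selection: K distinct items out of N, order ignored, one item picked per sub-step. The function names are hypothetical, and the count of partial selections is only a toy proxy for the extra intermediate states, not the paper's own state definition.

```python
from math import comb


def joint_action_count(n_items, k):
    """Joint actions when K of N items are chosen in one step (order ignored)."""
    return comb(n_items, k)


def iterative_action_counts(n_items, k):
    """Actions available at each of the K sub-steps when picking one item at a time."""
    return [n_items - i for i in range(k)]


def partial_selection_count(n_items, k):
    """Distinct partial selections (0 to K-1 items already picked) that an
    iterative formulation has to tell apart, per original state (toy proxy)."""
    return sum(comb(n_items, i) for i in range(k))


n, k = 50, 5
print(joint_action_count(n, k))        # 2118760
print(iterative_action_counts(n, k))   # [50, 49, 48, 47, 46]
print(partial_selection_count(n, k))   # 251176
```

Under these assumptions, each sub-step chooses among at most N actions instead of one choice among C(N, K), while the number of partial selections to distinguish grows quickly with K.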
