Stochastic Q-learning for Large Discrete Action Spaces

Fares Fourati; Vaneet Aggarwal; Mohamed-Slim Alouini

arXiv:2405.10310 · cs.LG · May 17, 2024

TL;DR

This paper introduces stochastic value-based reinforcement learning methods that efficiently handle large discrete action spaces by considering only a small, stochastic subset of actions per iteration, reducing computational costs while maintaining high performance.

Contribution

The paper proposes novel stochastic Q-learning algorithms that consider a subset of actions, with proven convergence and superior empirical performance over traditional methods.

Findings

01. Outperforms baseline methods in diverse control environments

02. Achieves near-optimal returns with reduced computation time

03. Converges theoretically under certain conditions

Abstract

In complex environments with large discrete action spaces, effective decision-making is critical in reinforcement learning (RL). Despite the widespread use of value-based RL approaches like Q-learning, they come with a computational burden, necessitating the maximization of a value function over all actions in each iteration. This burden becomes particularly challenging when addressing large-scale problems and using deep neural networks as function approximators. In this paper, we present stochastic value-based RL approaches which, in each iteration, as opposed to optimizing over the entire set of $n$ actions, only consider a variable stochastic set of a sublinear number of actions, possibly as small as $O(\log(n))$. The presented stochastic value-based RL methods include, among others, Stochastic Q-learning, StochDQN, and StochDDQN, all of which integrate this stochastic…
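The core idea in the abstract, replacing the maximum over all $n$ actions with a maximum over a small random subset, can be illustrated with a minimal tabular sketch. This is not the authors' implementation. The abstract does not say how the subset is built, so uniform sampling without replacement and a subset size of $\lceil \log_2 n \rceil$ are assumptions here, and the names `stoch_argmax` and `stoch_q_update` are hypothetical.

```python
import math
import random


def stoch_argmax(q_row, k, rng):
    """Return the highest-valued action among k uniformly sampled candidates.

    Assumption: candidates are drawn uniformly without replacement.
    """
    candidates = rng.sample(range(len(q_row)), k)
    return max(candidates, key=lambda a: q_row[a])


def stoch_q_update(Q, s, a, r, s_next, alpha, gamma, k, rng):
    """One tabular Q-learning step with the full max replaced by a subset max."""
    a_star = stoch_argmax(Q[s_next], k, rng)   # looks at k actions, not all n
    target = r + gamma * Q[s_next][a_star]
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]


n_actions = 1024
k = math.ceil(math.log2(n_actions))            # subset size on the order of log(n)
rng = random.Random(0)

# Two states, all values zero except one good action in state 1.
Q = [[0.0] * n_actions for _ in range(2)]
Q[1][7] = 1.0

new_value = stoch_q_update(
    Q, s=0, a=3, r=1.0, s_next=1, alpha=0.5, gamma=0.9, k=k, rng=rng
)
```

Each update inspects only `k` entries of `Q[s_next]` instead of all `n_actions`, which is where the computational saving comes from. The price is that the sampled maximum can fall below the true maximum whenever the best action is not in the subset.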

Taxonomy

Topics: Neural Networks and Applications · Anomaly Detection Techniques and Applications · Face and Expression Recognition

Methods: Sparse Evolutionary Training · Q-Learning