Extending a Quantum Reinforcement Learning Exploration Policy with Flags to Connect Four
Filipe Santos (1), João Paulo Fernandes (2), Luís Macedo (1) ((1) CISUC, DEI, University of Coimbra; (2) LIACC, New York University Abu Dhabi)

TL;DR
This paper extends a quantum reinforcement learning exploration policy with flags to the game of Connect Four, demonstrating improved sampling efficiency and superior exploration compared to epsilon-greedy, while analyzing the impact of game position and turn order.
Contribution
It applies and evaluates a quantum RL exploration policy with flags in Connect Four, highlighting its effectiveness and efficiency in a new, more complex setting.
Findings
Quantum agents sample flagged actions in fewer iterations.
Flagged exploration outperforms epsilon-greedy policy.
Win rates are similar between classical and quantum agents.
Abstract
Action selection based on flags is a Reinforcement Learning (RL) exploration policy that improves the exploration of the state space through the use of flags, which can identify the most promising actions to take in each state. The quantum counterpart of this exploration policy further improves upon this by taking advantage of a quadratic speedup for sampling flagged actions. This approach has already been successfully employed for the game of Checkers. In this work, we describe the application of this method to the context of Connect Four, in order to study its performance in a different setting, which can lead to a better generalization of the technique. We also kept track of a metric that wasn't taken into account in previous work: the average number of iterations to obtain a flagged action. Since going second is a significant disadvantage in Connect Four, we also had the intent of…
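The "average number of iterations to obtain a flagged action" metric and the quadratic speedup mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the speedup comes from Grover-style amplitude amplification (the abstract does not name the algorithm), that a state has at most 7 actions (one per column on a standard Connect Four board), and that classical sampling draws actions uniformly at random. All function names are hypothetical.

```python
import math
import random


def classical_draws_until_flagged(n_actions, flagged, rng):
    """Draw actions uniformly at random until a flagged one appears.

    Returns the number of draws used. `flagged` is a set of action
    indices (hypothetical representation; it must be non-empty).
    """
    draws = 0
    while True:
        draws += 1
        if rng.randrange(n_actions) in flagged:
            return draws


def grover_iterations(n_actions, n_flagged):
    """Textbook Grover iteration count, floor(pi/4 * sqrt(N/M)), at least 1."""
    return max(1, math.floor(math.pi / 4 * math.sqrt(n_actions / n_flagged)))


def grover_success_probability(n_actions, n_flagged, k):
    """Probability of measuring a flagged action after k Grover iterations."""
    theta = math.asin(math.sqrt(n_flagged / n_actions))
    return math.sin((2 * k + 1) * theta) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    n_actions, flagged = 7, {3}  # assumption: 7 columns, one flagged action
    trials = 20_000
    mean_draws = sum(
        classical_draws_until_flagged(n_actions, flagged, rng)
        for _ in range(trials)
    ) / trials
    k = grover_iterations(n_actions, len(flagged))
    p = grover_success_probability(n_actions, len(flagged), k)
    print(f"classical mean draws: {mean_draws:.2f}")
    print(f"Grover iterations: {k}, success probability: {p:.3f}")
```

With so few actions per state the absolute gap is small; the classical expectation scales as N/M draws, while the Grover count scales as the square root of N/M, so the difference only becomes pronounced for larger action spaces.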
Taxonomy
Topics: Quantum Computing Algorithms and Architecture · Quantum many-body systems · Quantum Information and Cryptography
