Extending a Quantum Reinforcement Learning Exploration Policy with Flags to Connect Four

Filipe Santos (1), João Paulo Fernandes (2), Luís Macedo (1) ((1) CISUC, DEI, University of Coimbra; (2) LIACC, New York University Abu Dhabi)

arXiv:2505.04371 · cs.LG · May 9, 2025

TL;DR

This paper extends a quantum reinforcement learning exploration policy with flags to the game of Connect Four, demonstrating improved sampling efficiency and superior exploration compared to epsilon-greedy, while analyzing the impact of game position and turn order.

Contribution

It applies and evaluates a quantum RL exploration policy with flags in Connect Four, highlighting its effectiveness and efficiency in a new, more complex setting.

Findings

01

Quantum agents sample flagged actions in fewer iterations than classical agents.

02

Flag-based exploration outperforms the epsilon-greedy policy.

03

Win rates are similar between classical and quantum agents.
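
To make the comparison in the findings concrete, the following is a minimal sketch of the two kinds of exploration policy, not the authors' implementation. It assumes a hypothetical per-state list of boolean flags (`flags`) marking the promising actions and a hypothetical list of action values (`q_values`). The fallback to all actions when nothing is flagged is also an assumption made for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def flag_based(flags, rng):
    """Pick uniformly among flagged actions (hypothetical representation).

    Assumption: if no action is flagged, fall back to all actions.
    """
    flagged = [a for a, is_flagged in enumerate(flags) if is_flagged]
    candidates = flagged if flagged else list(range(len(flags)))
    return rng.choice(candidates)


rng = random.Random(0)
# Connect Four has at most 7 columns to choose from in each state.
greedy_action = epsilon_greedy([0.1, 0.4, 0.2, 0.9, 0.0, 0.3, 0.5], 0.0, rng)
flagged_action = flag_based([False, True, False, True, False, False, False], rng)
```

In this sketch, epsilon-greedy explores blindly over all actions, whereas the flag-based policy restricts exploration to the actions currently marked as promising.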

Abstract

Action selection based on flags is a Reinforcement Learning (RL) exploration policy that improves the exploration of the state space through the use of flags, which can identify the most promising actions to take in each state. The quantum counterpart of this exploration policy further improves upon this by taking advantage of a quadratic speedup for sampling flagged actions. This approach has already been successfully employed for the game of Checkers. In this work, we describe the application of this method to the context of Connect Four, in order to study its performance in a different setting, which can lead to a better generalization of the technique. We also kept track of a metric that wasn't taken into account in previous work: the average number of iterations to obtain a flagged action. Since going second is a significant disadvantage in Connect Four, we also had the intent of…
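
The quadratic speedup mentioned in the abstract can be illustrated with textbook Grover-style search, which may differ from the circuit the authors actually use. The sketch below assumes `n_actions` candidate actions of which `n_flagged` are flagged. It compares the average number of uniform classical draws needed to hit a flagged action (about N/M) with the standard Grover iteration count (about (π/4)·√(N/M)). All function names and the example sizes are hypothetical.

```python
import math
import random


def classical_draws_until_flagged(n_actions, flagged, rng):
    """Draw actions uniformly at random until one is flagged; return the draw count."""
    draws = 0
    while True:
        draws += 1
        if rng.randrange(n_actions) in flagged:
            return draws


def grover_iterations(n_actions, n_flagged):
    """Textbook Grover iteration count: floor(pi/4 * sqrt(N/M))."""
    return math.floor(math.pi / 4 * math.sqrt(n_actions / n_flagged))


def grover_success_probability(n_actions, n_flagged, iterations):
    """Probability of measuring a flagged item after the given number of iterations."""
    theta = math.asin(math.sqrt(n_flagged / n_actions))
    return math.sin((2 * iterations + 1) * theta) ** 2


rng = random.Random(0)
n_actions, flagged = 64, {17}  # illustrative sizes, not taken from the paper
trials = 5000
mean_classical = sum(
    classical_draws_until_flagged(n_actions, flagged, rng) for _ in range(trials)
) / trials
k = grover_iterations(n_actions, len(flagged))
p = grover_success_probability(n_actions, len(flagged), k)
print(f"classical mean draws: {mean_classical:.1f}")
print(f"Grover iterations: {k}, success probability: {p:.3f}")
```

The classical average grows linearly with N/M while the Grover count grows with its square root, which is the kind of gap the "average number of iterations to obtain a flagged action" metric is meant to capture.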

Taxonomy

Topics: Quantum Computing Algorithms and Architecture · Quantum many-body systems · Quantum Information and Cryptography