Bandit approach to conflict-free multi-agent Q-learning in view of photonic implementation

Hiroaki Shinkawa; Nicolas Chauvet; André Röhm; Takatomo Mihana; Ryoichi Horisaki; Guillaume Bachelier; and Makoto Naruse

arXiv:2212.09926·cs.AI·December 21, 2022

Open Access

TL;DR

This paper introduces a photonic multi-agent reinforcement learning scheme built on a discontinuous bandit Q-learning algorithm. Quantum interference of photons keeps the agents' decisions conflict-free, and simulations show accelerated learning in dynamic environments.

Contribution

It proposes a new photonic reinforcement learning algorithm and a multi-agent architecture that leverages quantum interference for conflict-free, faster learning in multi-agent systems.

Findings

01

The proposed algorithm effectively adapts to dynamic environments.

02

Quantum interference enables conflict-free multi-agent decision-making.

03

Simulation results show accelerated learning in multi-agent settings.

Abstract

Recently, photonic reinforcement learning, which exploits the physical nature of light to accelerate computation, has been studied extensively. Previous studies utilized quantum interference of photons to achieve collective decision-making without choice conflicts when solving the competitive multi-armed bandit problem, a fundamental example of reinforcement learning. However, the bandit problem deals with a static environment where the agent's action does not influence the reward probabilities. This study aims to extend the conventional approach to a more general multi-agent reinforcement learning setting, targeting the grid world problem. Unlike the conventional approach, the proposed scheme deals with a dynamic environment where the reward changes because of agents' actions. A successful photonic reinforcement learning scheme requires both a photonic system that…

Taxonomy

Topics: Neural Networks and Reservoir Computing · Semiconductor Lasers and Optical Devices · Quantum Information and Cryptography