Parallel bandit architecture based on laser chaos for reinforcement learning

Takashi Urushibara; Nicolas Chauvet; Satoshi Kochi; Satoshi Sunada; Kazutaka Kanno; Atsushi Uchida; Ryoichi Horisaki; Makoto Naruse

arXiv:2205.09543 · cs.ET · May 20, 2022

Open Access

TL;DR

This paper introduces a novel parallel bandit architecture for reinforcement learning that leverages photonic decision-making and laser chaos, demonstrating faster adaptation and unique state properties compared to traditional Q-learning.

Contribution

The study proposes the PBRL architecture, enabling multi-state reinforcement learning with photonic decision-makers, and shows its advantages over Q-learning in speed and environment adaptation.

Findings

01. PBRL adapts faster than Q-learning in a cart-pole task.

02. Chaotic laser sequences improve learning speed due to autocorrelation.

03. PBRL exhibits distinct state transition properties compared to Q-learning.

Abstract

Accelerating artificial intelligence by photonics is an active field of study aiming to exploit the unique properties of photons. Reinforcement learning is an important branch of machine learning, and photonic decision-making principles have been demonstrated for the multi-armed bandit problem. However, reinforcement learning can involve a massive number of states, unlike previously demonstrated bandit problems, where the number of states is only one. Q-learning is a well-known approach in reinforcement learning that can deal with many states. The architecture of Q-learning, however, is not well suited to photonic implementations because it separates the update rule from action selection. In this study, we organize a new architecture for multi-state reinforcement learning as a parallel array of bandit problems in order to benefit from photonic decision-makers, which we call…
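The idea of treating multi-state reinforcement learning as a parallel array of bandit problems can be illustrated with a minimal Python sketch, given here under stated assumptions rather than as the authors' implementation. Each state owns an independent bandit, and the current state only selects which bandit to consult. The epsilon-greedy rule is a software stand-in for the photonic decision-maker driven by laser chaos. The class names, the toy two-state environment, and the immediate-reward update are all hypothetical, since the abstract does not specify how rewards are assigned to each bandit.

```python
import random


class EpsilonGreedyBandit:
    """One independent bandit keeping a running mean reward per action.

    Assumption: epsilon-greedy stands in for the photonic decision-maker.
    """

    def __init__(self, n_actions, epsilon, rng):
        self.counts = [0] * n_actions
        self.values = [0.0] * n_actions
        self.epsilon = epsilon
        self.rng = rng

    def greedy(self):
        # Index of the action with the highest estimated value (ties -> lowest index).
        return max(range(len(self.values)), key=self.values.__getitem__)

    def select(self):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return self.greedy()

    def update(self, action, reward):
        # Incremental mean: selection and update live in the same unit.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


class ParallelBanditAgent:
    """Hypothetical agent: one bandit per state, indexed by the current state."""

    def __init__(self, n_states, n_actions, epsilon=0.1, seed=0):
        rng = random.Random(seed)
        self.bandits = [
            EpsilonGreedyBandit(n_actions, epsilon, rng) for _ in range(n_states)
        ]

    def act(self, state):
        return self.bandits[state].select()

    def learn(self, state, action, reward):
        self.bandits[state].update(action, reward)


def train_toy(steps=500, seed=0):
    """Toy environment (an assumption): in state s, only action s pays reward 1."""
    agent = ParallelBanditAgent(n_states=2, n_actions=2, seed=seed)
    for t in range(steps):
        state = t % 2  # alternate between the two states
        action = agent.act(state)
        reward = 1.0 if action == state else 0.0
        agent.learn(state, action, reward)
    return agent
```

Calling `train_toy()` returns an agent whose per-state bandits have each learned their own preferred action. The sketch omits delayed rewards and state transitions that depend on the chosen action, both of which a task such as cart-pole would require.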

Taxonomy

Topics: Neural Networks and Reservoir Computing · Neurobiology and Insect Physiology Research · Optical Network Technologies

Methods: Q-Learning