On the convergence of projective-simulation-based reinforcement learning in Markov decision processes

Walter L. Boyajian; Jens Clausen; Lea M. Trenkwalder; Vedran Dunjko; Hans J. Briegel

arXiv:1910.11914·cs.LG·November 13, 2020

TL;DR

This paper formally proves that one version of projective simulation, understood as a reinforcement learning approach, converges to optimal behavior in Markov decision processes.

Contribution

It offers a formal convergence proof for projective simulation in standard reinforcement learning scenarios, an area where very few formal theoretical analyses of the model had been provided.

Findings

01. Proves convergence of one version of the projective simulation model to optimal behavior in Markov decision processes.

02. Establishes theoretical guarantees for a physically-inspired reinforcement learning approach.

03. Strengthens the theoretical footing of a framework designed in a manner that may support quantum walk-based speed-ups.

Abstract

In recent years, the interest in leveraging quantum effects for enhancing machine learning tasks has significantly increased. Many algorithms speeding up supervised and unsupervised learning were established. The first framework in which ways to exploit quantum resources specifically for the broader context of reinforcement learning were found is projective simulation. Projective simulation presents an agent-based reinforcement learning approach designed in a manner which may support quantum walk-based speed-ups. Although classical variants of projective simulation have been benchmarked against common reinforcement learning algorithms, very few formal theoretical analyses have been provided for its performance in standard learning scenarios. In this paper, we provide a detailed formal discussion of the properties of this model. Specifically, we prove that one version of the projective…
