Quantum Reinforcement Learning via Policy Iteration

El Amine Cherrat; Iordanis Kerenidis; Anupam Prakash

arXiv:2203.01889 · quant-ph · November 28, 2023 · 1 citation

TL;DR

This paper introduces a general framework for quantum reinforcement learning via policy iteration: quantum algorithms evaluate and improve policies, and their theoretical and experimental performance is studied on OpenAI Gym environments.

Contribution

The paper presents a quantum policy iteration framework with two components: quantum policy evaluation for infinite horizon discounted problems, and quantum policy improvement based on post-processing measurement outcomes.

Findings

01. Quantum policy evaluation builds quantum states that approximately encode the value function of a policy.

02. Quantum policy improvement post-processes measurement outcomes on these quantum states.

03. The algorithms' theoretical and experimental performance is studied on OpenAI Gym environments.
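
The first two findings can be illustrated with a purely classical simulation. The sketch below assumes amplitude encoding (a vector stored as the normalized amplitudes of a state), which is one common way to encode a vector in a quantum state; this summary does not specify the paper's exact encoding. Measurements are simulated by sampling indices with probability equal to the squared amplitude, and the post-processing step simply picks the most frequent outcome. The function names and the example values in `q_values` are hypothetical and chosen only for illustration.

```python
import numpy as np

def amplitude_encode(values):
    """Return the unit-norm vector proportional to `values` (assumed encoding)."""
    values = np.asarray(values, dtype=float)
    norm = np.linalg.norm(values)
    if norm == 0:
        raise ValueError("cannot encode the zero vector")
    return values / norm

def sample_measurements(amplitudes, shots, rng):
    """Simulate computational-basis measurements: index i occurs with
    probability amplitudes[i] ** 2."""
    probs = amplitudes ** 2
    probs = probs / probs.sum()  # guard against floating-point drift
    return rng.choice(len(amplitudes), size=shots, p=probs)

def most_frequent_outcome(outcomes, n_outcomes):
    """Post-process samples: return the index observed most often."""
    counts = np.bincount(outcomes, minlength=n_outcomes)
    return int(np.argmax(counts))

# Hypothetical non-negative action values for a single state.
# Squared amplitudes discard signs, so this sketch assumes values >= 0.
rng = np.random.default_rng(0)
q_values = [1.0, 3.0, 0.5]
state = amplitude_encode(q_values)
outcomes = sample_measurements(state, shots=2000, rng=rng)
action = most_frequent_outcome(outcomes, n_outcomes=len(q_values))
```

Because the sampling probability grows with the square of each entry, larger values dominate the measurement statistics, which is why a simple count over outcomes can single out the largest entry in this toy setting.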

Abstract

Quantum computing has shown the potential to substantially speed up machine learning applications, in particular for supervised and unsupervised learning. Reinforcement learning, on the other hand, has become essential for solving many decision-making problems, and policy iteration methods remain the foundation of such approaches. In this paper, we provide a general framework for performing quantum reinforcement learning via policy iteration. We validate our framework by designing and analyzing: quantum policy evaluation methods for infinite horizon discounted problems by building quantum states that approximately encode the value function of a policy $\pi$; and quantum policy improvement methods by post-processing measurement outcomes on these quantum states. Last, we study the theoretical and experimental performance of our quantum algorithms on two environments from…
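
For readers unfamiliar with the classical baseline, the following is a minimal sketch of classical policy iteration for an infinite horizon discounted problem. It is not the paper's quantum algorithm: policy evaluation here solves the linear system $(I - \gamma P^{\pi}) V^{\pi} = R^{\pi}$ exactly with NumPy, and policy improvement takes a greedy argmax. The array layout (`P` with shape states × actions × states, `R` with shape states × actions) and the two-state toy MDP are assumptions made for illustration.

```python
import numpy as np

def policy_evaluation(P, R, policy, gamma):
    """Solve (I - gamma * P_pi) V = R_pi for a deterministic policy."""
    n_states = P.shape[0]
    idx = np.arange(n_states)
    P_pi = P[idx, policy]  # (S, S): transitions under the policy
    R_pi = R[idx, policy]  # (S,): rewards under the policy
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)

def policy_improvement(P, R, V, gamma):
    """Return the policy that is greedy with respect to V."""
    Q = R + gamma * (P @ V)  # (S, A) action values
    return np.argmax(Q, axis=1)

def policy_iteration(P, R, gamma=0.9, max_iters=100):
    """Alternate evaluation and improvement until the policy is stable."""
    policy = np.zeros(P.shape[0], dtype=int)
    for _ in range(max_iters):
        V = policy_evaluation(P, R, policy, gamma)
        new_policy = policy_improvement(P, R, V, gamma)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, V

# Hypothetical toy MDP: action 0 stays put, action 1 moves to the other state.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = P[1, 0, 1] = 1.0  # action 0: stay
P[0, 1, 1] = P[1, 1, 0] = 1.0  # action 1: switch
R = np.array([[0.0, 0.0],
              [1.0, 0.0]])     # reward 1 only for staying in state 1

policy, V = policy_iteration(P, R, gamma=0.9)
```

In this toy problem the stable policy moves from state 0 to state 1 and then stays there. The exact linear solve in `policy_evaluation` is the step that the abstract's quantum policy evaluation replaces with quantum states that approximately encode the value function.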

Taxonomy

Topics: Quantum Computing Algorithms and Architecture · Quantum Information and Cryptography · Neural Networks and Reservoir Computing
