Quantum-enhanced reinforcement learning for finite-episode games with discrete state spaces

Florian Neukart; David Von Dollen; Christian Seidel; Gabriele Compostella

arXiv:1708.09354·quant-ph·September 18, 2017

TL;DR

This paper demonstrates how quantum annealing can be used to enhance reinforcement learning in finite-episode games with discrete states by embedding policy evaluation and value function approximation as QUBO problems on a D-Wave quantum processor.

Contribution

It introduces methods to embed Monte Carlo policy evaluation and value function approximation into quantum annealing hardware for reinforcement learning tasks.
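
For orientation, the classical baseline that the paper builds on can be sketched as first-visit Monte Carlo policy evaluation. This is a minimal illustration of the standard technique, not the authors' quantum embedding; the `first_visit_mc` helper, the `corridor_episode` toy environment, and the discount factor are all assumptions introduced here.

```python
from collections import defaultdict


def first_visit_mc(sample_episode, num_episodes, gamma=1.0):
    """Estimate V(s) for a fixed policy by averaging first-visit returns.

    sample_episode() is assumed to return a list of (state, reward) pairs,
    where reward is what the agent receives after leaving that state.
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for _ in range(num_episodes):
        episode = sample_episode()
        g = 0.0
        first_return = {}
        # Walk backwards through the episode; later overwrites correspond to
        # earlier time steps, so each state ends up with its first-visit return.
        for state, reward in reversed(episode):
            g = reward + gamma * g
            first_return[state] = g
        for state, g_s in first_return.items():
            returns_sum[state] += g_s
            returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


def corridor_episode():
    """Hypothetical deterministic game: states 0 -> 1 -> 2, reward 1 at the end."""
    return [(0, 0.0), (1, 0.0), (2, 1.0)]


values = first_visit_mc(corridor_episode, num_episodes=10, gamma=0.5)
print(values)
```

Because the toy episode is deterministic, every episode yields the same returns; a real finite-episode game would supply a stochastic `sample_episode` and the averages would converge over many episodes.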

Findings

1. Quantum-enhanced policy evaluation finds better or equivalent value functions.

2. Embedding RL problems as QUBO enables quantum hardware utilization.

3. Quantum-classical algorithms improve efficiency in finite-episode games.

Abstract

Quantum annealing algorithms belong to the class of metaheuristic tools applicable to solving binary optimization problems. Hardware implementations of quantum annealing, such as the quantum annealing machines produced by D-Wave Systems, have been subject to multiple analyses in research, with the aim of characterizing the technology's usefulness for optimization and sampling tasks. Here, we present a way to partially embed Monte Carlo policy iteration for finding an optimal policy on random observations, and show how to embed n sub-optimal state-value functions to approximate an improved state-value function given a policy, for finite-horizon games with discrete state spaces, on a D-Wave 2000Q quantum processing unit (QPU). We explain how both problems can be expressed as a quadratic unconstrained binary optimization (QUBO) problem, and show that quantum-enhanced Monte…
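
To make the QUBO formulation mentioned in the abstract concrete, the sketch below minimizes x^T Q x over binary vectors by exhaustive enumeration. It only illustrates the problem class that an annealer samples from; the toy coefficients in `Q` and the helper names are hypothetical and do not reproduce the paper's embedding.

```python
from itertools import product


def qubo_energy(Q, x):
    """Return x^T Q x for a binary tuple x and a dict mapping (i, j) to weights."""
    return sum(w * x[i] * x[j] for (i, j), w in Q.items())


def solve_qubo_brute_force(Q, n):
    """Enumerate all 2^n binary vectors and return the lowest-energy one."""
    best = min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))
    return best, qubo_energy(Q, best)


# Hypothetical toy problem: diagonal terms reward setting a bit,
# the off-diagonal term penalizes setting both bits at once.
Q = {(0, 0): -1.0, (1, 1): -2.0, (0, 1): 3.0}

best_x, best_e = solve_qubo_brute_force(Q, 2)
print(best_x, best_e)
```

Exhaustive search scales as 2^n and is only practical for a handful of variables, which is the motivation for handing larger QUBO instances to a heuristic sampler such as a quantum annealer.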
