Photonic Quantum Policy Learning in OpenAI Gym

Dániel Nagy, Zsolt Tabi, Péter Hága, Zsófia Kallus, and Zoltán Zimborás

arXiv:2108.12926 · quant-ph · August 31, 2021

TL;DR

This paper applies photonic quantum policy learning to a continuous control task in OpenAI Gym and reports that the quantum agents reach performance comparable to a classical neural network while converging faster.

Contribution

The paper introduces proximal policy optimization for photonic variational quantum agents and studies the effect of data re-uploading in quantum reinforcement learning.

Findings

01. Quantum agents achieved performance comparable to that of classical neural networks.

02. Quantum agents converged faster than their classical counterparts.

03. Photonic quantum policies are promising for control tasks on NISQ devices.

Abstract

In recent years, near-term noisy intermediate-scale quantum (NISQ) computing devices have become available. One of the most promising application areas for such NISQ quantum computer prototypes is quantum machine learning. While quantum neural networks are widely studied for supervised learning, quantum reinforcement learning is still an emerging field. To solve a classical continuous control problem, we use a continuous-variable quantum machine learning approach. We introduce proximal policy optimization for photonic variational quantum agents and also study the effect of data re-uploading. We present a performance assessment via an empirical study using the Strawberry Fields photonic simulator with its Fock backend and a hybrid training framework connected to an OpenAI Gym environment and TensorFlow. For the restricted CartPole problem, the two variations of the…
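The abstract names proximal policy optimization (PPO) as the training method. As a rough illustration only, the standard PPO clipped surrogate objective can be sketched in plain Python as below. This is not the authors' implementation: in the paper the action probabilities would come from a photonic variational circuit, whereas here they are plain numbers. The function name `ppo_clip_objective` and the default `clip_eps=0.2` are assumptions chosen for the example.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    All names here are hypothetical. The log-probabilities stand in for
    log pi(a|s) under the updated and the data-collecting policy.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Usage: one sample whose probability doubled under the new policy.
example = ppo_clip_objective([math.log(2.0)], [0.0], [1.0])
```

With a positive advantage, the clipping caps the benefit of moving the ratio above `1 + clip_eps`, which discourages large policy updates from a single batch. With a negative advantage the unclipped term is kept when it is worse, so bad moves are still fully penalized.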
