Taking gradients through experiments: LSTMs and memory proximal policy optimization for black-box quantum control

Moritz August; José Miguel Hernández-Lobato

arXiv:1802.04063·cs.LG·April 16, 2018

TL;DR

This paper introduces a novel reinforcement learning approach using LSTM networks and a new variant of PPO, called MPPO, to optimize black-box quantum control tasks, achieving state-of-the-art results.

Contribution

The paper proposes MPPO, a variant of PPO tailored to quantum control problems, in which agents parameterized by LSTM networks are trained via stochastic policy gradients.

Findings

01

Achieves state-of-the-art results on several quantum control learning tasks with discrete and continuous control parameters.

02

Demonstrates effectiveness of LSTM-based policies in black-box quantum control.

03

Introduces MPPO, a new variant of PPO, for reinforcement learning in black-box quantum control.

Abstract

In this work we introduce the application of black-box quantum control as an interesting reinforcement learning problem to the machine learning community. We analyze the structure of the reinforcement learning problems arising in quantum physics and argue that agents parameterized by long short-term memory (LSTM) networks trained via stochastic policy gradients yield a general method for solving them. In this context we introduce a variant of the proximal policy optimization (PPO) algorithm called memory proximal policy optimization (MPPO), which is based on this analysis. We then show how it can be applied to specific learning tasks and present results of numerical experiments showing that our method achieves state-of-the-art results for several learning tasks in quantum control with discrete and continuous control parameters.
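For readers unfamiliar with PPO, the following minimal sketch shows the clipped surrogate objective of standard PPO, the algorithm that MPPO is described as a variant of. It is not the paper's MPPO, whose details are not given in the abstract. The function name `ppo_clipped_objective`, the default `clip_eps=0.2`, and the toy probabilities and advantages are illustrative assumptions, not values from the paper.

```python
import numpy as np

def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective of standard PPO (a quantity to maximize).

    Illustrative sketch only; this is NOT the paper's MPPO variant.
    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    """
    # Probability ratio between the current and the old policy.
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    # Clipping the ratio removes the incentive to move far from the old policy.
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic (smaller) term per sample, then average.
    return np.mean(np.minimum(unclipped, clipped))

# Toy data (hypothetical): two sampled actions with opposite advantages.
logp_old = np.log(np.array([0.5, 0.5]))
logp_new = np.log(np.array([0.75, 0.25]))
advantages = np.array([1.0, -1.0])

objective = ppo_clipped_objective(logp_new, logp_old, advantages)
print(objective)
```

In this toy case the probability ratios are 1.5 and 0.5, so both samples fall outside the clipping interval [0.8, 1.2] and the clipped terms determine the objective. When the new and old policies coincide, every ratio is 1 and the objective reduces to the mean advantage.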
