Quantum policy gradient algorithms
Sofiene Jerbi, Arjan Cornelissen, Māris Ozols, Vedran Dunjko

TL;DR
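This paper develops quantum algorithms that train reinforcement learning policies through quantum interactions with an environment. They offer full quadratic speed-ups in sample complexity over their classical analogs when the trained policies satisfy certain regularity conditions, and policies derived from parametrized quantum circuits turn out to be well-behaved in this respect, which highlights the potential of fully-quantum reinforcement learning.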
Contribution
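The paper introduces quantum algorithms that train reinforcement learning policies by exploiting quantum interactions with an environment. It shows that these algorithms have an advantage over their classical analogs under certain regularity conditions on the policy, and that policies built from parametrized quantum circuits are well-behaved with respect to those conditions.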
Findings
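Quantum algorithms can achieve full quadratic speed-ups in sample complexity over their classical analogs, provided the trained policies satisfy certain regularity conditions.
Policies derived from parametrized quantum circuits are well-behaved with respect to these conditions.
These results point to the benefits of a fully-quantum reinforcement learning framework.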
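To make the sample-complexity claim concrete, here is a minimal classical baseline sketch. It is not the paper's algorithm. It assumes a toy two-armed bandit with deterministic rewards and a softmax policy, and every function name in it is hypothetical. It estimates the policy gradient by plain Monte Carlo sampling (a REINFORCE-style estimator) and measures how the error shrinks with the number of samples N. Classically the error decays roughly as 1/sqrt(N), so reaching precision ε costs on the order of 1/ε² samples. A quadratic speed-up in sample complexity would, in the usual sense of the term, bring this down to the order of 1/ε.

```python
import math
import random


def softmax(theta):
    """Numerically stable softmax over a list of logits."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def exact_gradient(theta, rewards):
    """Exact gradient of J = sum_a pi_a * r_a for a softmax policy.

    For deterministic rewards, dJ/dtheta_k = pi_k * (r_k - J).
    """
    pi = softmax(theta)
    j = sum(p * r for p, r in zip(pi, rewards))
    return [p * (r - j) for p, r in zip(pi, rewards)]


def reinforce_estimate(theta, rewards, n_samples, rng):
    """Monte Carlo (REINFORCE-style) estimate of the same gradient."""
    pi = softmax(theta)
    actions = range(len(theta))
    grad = [0.0] * len(theta)
    for _ in range(n_samples):
        a = rng.choices(actions, weights=pi)[0]
        r = rewards[a]
        for k in actions:
            # d log pi(a) / d theta_k = 1[a == k] - pi_k
            grad[k] += r * ((1.0 if a == k else 0.0) - pi[k])
    return [g / n_samples for g in grad]


def rms_error(theta, rewards, n_samples, n_trials, seed=0):
    """Root-mean-square error of the estimator over repeated trials."""
    rng = random.Random(seed)
    exact = exact_gradient(theta, rewards)
    total = 0.0
    for _ in range(n_trials):
        est = reinforce_estimate(theta, rewards, n_samples, rng)
        total += sum((e - x) ** 2 for e, x in zip(est, exact))
    return math.sqrt(total / n_trials)


if __name__ == "__main__":
    theta = [0.0, 0.0]      # assumed toy policy parameters
    rewards = [1.0, 0.0]    # assumed toy rewards, one per arm
    for n in (100, 1600):
        print(n, rms_error(theta, rewards, n, n_trials=200))
```

In this sketch, multiplying the sample count by 16 should shrink the error by a factor of about 4, which is the 1/sqrt(N) behaviour that a quadratic speed-up improves on.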
Abstract
Understanding the power and limitations of quantum access to data in machine learning tasks is primordial to assess the potential of quantum computing in artificial intelligence. Previous works have already shown that speed-ups in learning are possible when given quantum access to reinforcement learning environments. Yet, the applicability of quantum algorithms in this setting remains very limited, notably in environments with large state and action spaces. In this work, we design quantum algorithms to train state-of-the-art reinforcement learning policies by exploiting quantum interactions with an environment. However, these algorithms only offer full quadratic speed-ups in sample complexity over their classical analogs when the trained policies satisfy some regularity conditions. Interestingly, we find that reinforcement learning policies derived from parametrized quantum circuits are…
