Reinforcement learning for quantum processes with memory

Josep Lumbreras; Ruo Cheng Huang; Yanglin Hu; Marco Fanizza; Mile Gu

arXiv:2603.25138 · quant-ph · March 27, 2026

Open Access

TL;DR

This paper develops a reinforcement learning framework for quantum systems with hidden memory, providing algorithms with provable sublinear regret bounds and demonstrating applications in quantum thermodynamics.

Contribution

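The paper introduces a reinforcement learning model for quantum processes with a hidden quantum memory that evolves under unknown dynamics, extends an optimistic maximum-likelihood estimation algorithm to continuous action spaces, and establishes optimal regret bounds, with applications to quantum thermodynamics.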

Findings

01

Regret scales as $\widetilde{O}(\sqrt{K})$ over $K$ episodes.

02

Proves the optimality of the regret bounds via information-theoretic lower bounds.

03

Demonstrates sublinear dissipation in quantum work extraction using the proposed algorithm.
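
For readers unfamiliar with the term, the regret of an episodic learner is usually defined as the cumulative gap between the best achievable expected reward and the expected reward of the policy actually played in each episode. The notation below ($V^{*}$ for the optimal value, $V^{\pi_k}$ for the value of the policy used in episode $k$) is assumed here for illustration and may differ from the paper's own.

```latex
\mathrm{Regret}(K) \;=\; \sum_{k=1}^{K} \bigl( V^{*} - V^{\pi_k} \bigr),
\qquad
\text{sublinear regret:}\quad \lim_{K \to \infty} \frac{\mathrm{Regret}(K)}{K} = 0 .
```

Under this definition, sublinear regret means the average per-episode loss relative to the optimal policy vanishes as the number of episodes grows.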

Abstract

In reinforcement learning, an agent interacts sequentially with an environment to maximize a reward, receiving only partial, probabilistic feedback. This creates a fundamental exploration-exploitation trade-off: the agent must explore to learn the hidden dynamics while exploiting this knowledge to maximize its target objective. While extensively studied classically, applying this framework to quantum systems requires dealing with hidden quantum states that evolve via unknown dynamics. We formalize this problem via a framework where the environment maintains a hidden quantum memory evolving via unknown quantum channels, and the agent intervenes sequentially using quantum instruments. For this setting, we adapt an optimistic maximum-likelihood estimation algorithm. We extend the analysis to continuous action spaces, allowing us to model general positive operator-valued measures (POVMs).…
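
The interaction loop described in the abstract can be sketched with a toy simulation. This is a minimal illustration under stated assumptions, not the paper's algorithm: the hidden memory is taken to be a single qubit, each action is a measurement angle `theta` defining a two-outcome projective instrument, and the "unknown" dynamics is an amplitude-damping channel whose strength `gamma` the agent would not see. All function names (`projective_instrument`, `amplitude_damping`, `step`, `run_episode`) are hypothetical.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def apply_kraus(k, rho):
    """Unnormalised update K rho K^dagger."""
    return matmul(matmul(k, rho), dagger(k))

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def projective_instrument(theta):
    """Assumed action model: projective measurement along
    |psi> = (cos(theta/2), sin(theta/2)) and its orthogonal state."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    k0 = [[c * c, c * s], [c * s, s * s]]
    k1 = [[s * s, -c * s], [-c * s, c * c]]
    return [k0, k1]

def amplitude_damping(gamma):
    """Assumed hidden dynamics: Kraus operators of an amplitude-damping channel."""
    return [[[1.0, 0.0], [0.0, math.sqrt(1.0 - gamma)]],
            [[0.0, math.sqrt(gamma)], [0.0, 0.0]]]

def outcome_probs(instrument, rho):
    """Born-rule probabilities Tr(K rho K^dagger) for each outcome."""
    return [complex(trace(apply_kraus(k, rho))).real for k in instrument]

def step(rho, theta, channel, rng):
    """One round: the agent measures, then the hidden memory evolves."""
    instrument = projective_instrument(theta)
    probs = outcome_probs(instrument, rho)
    outcome = 0 if rng.random() < probs[0] else 1
    post = apply_kraus(instrument[outcome], rho)
    post = [[x / probs[outcome] for x in row] for row in post]
    evolved = [[0.0, 0.0], [0.0, 0.0]]
    for e in channel:  # sum over Kraus operators of the hidden channel
        term = apply_kraus(e, post)
        evolved = [[evolved[i][j] + term[i][j] for j in range(2)]
                   for i in range(2)]
    return outcome, evolved

def run_episode(thetas, gamma, seed=0):
    """Play one episode from |0><0|; the agent only sees the outcomes."""
    rng = random.Random(seed)
    rho = [[1.0, 0.0], [0.0, 0.0]]
    channel = amplitude_damping(gamma)
    outcomes = []
    for theta in thetas:
        outcome, rho = step(rho, theta, channel, rng)
        outcomes.append(outcome)
    return outcomes, rho

if __name__ == "__main__":
    outcomes, final_rho = run_episode([0.0, math.pi / 2, math.pi], gamma=0.3)
    print(outcomes, complex(trace(final_rho)).real)
```

The agent in this sketch observes only the list of outcomes, never `rho` or `gamma`, which mirrors the partial, probabilistic feedback the abstract refers to. A learning algorithm would sit on top of `run_episode` and choose the angles from past outcomes.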

Taxonomy

Topics: Advanced Thermodynamics and Statistical Mechanics · Quantum Computing Algorithms and Architecture · Advanced Bandit Algorithms Research