Complexity scaling and optimal policy degeneracy in quantum reinforcement learning via analytically solvable unitary-control-then-measure models

Andrea Cintio; Alessandro Michelangeli; Dmitrii Tsutskov

arXiv:2604.13096 · math.GM · April 16, 2026

TL;DR

This paper introduces analytically solvable quantum reinforcement learning models using a unitary-control-then-measure protocol, revealing complexity reductions and unique degeneracy phenomena in optimal policies.

Contribution

It provides explicit solutions for quantum RL models and uncovers structural complexity reductions and policy degeneracy phenomena not seen in measurement-free control.

Findings

01. The computational complexity of the expected return drops from exponential to polynomial in the trajectory length.

02. The complexity reduction occurs at two levels: trajectory-based and policy-based.

03. Optimal policies show distinctive degeneracy behaviors influenced by the quantum Zeno effect.
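
As a rough illustration of the quantum Zeno effect mentioned in the last finding, the sketch below splits a fixed qubit rotation into many small steps, each followed by a projective measurement in the computational basis. The rotation family and the function name `survival_probability` are assumptions made for this example only; how the effect shapes policy degeneracy in the paper's models is not described in this summary.

```python
import math

def survival_probability(total_angle, n_steps):
    """Probability that a qubit starting in |0> is found in |0> at every one
    of n_steps measurements, when each control step applies the real rotation
    exp(-i * (total_angle / n_steps) * sigma_y / 2).

    Illustrative assumption: this rotation family is not taken from the paper.
    """
    step = total_angle / n_steps
    # Born rule for one step: |<0|U(step)|0>|^2 = cos^2(step / 2).
    single_step = math.cos(step / 2) ** 2
    return single_step ** n_steps

# A single pi rotation flips the qubit, while frequent measurement freezes it.
for n in (1, 10, 1000):
    print(n, survival_probability(math.pi, n))
```

The survival probability grows towards 1 as the number of interleaved measurements increases, which is the standard Zeno behavior.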

Abstract

We propose and analyse a class of analytically solvable models of quantum reinforcement learning (QRL), formulated as finite-horizon Markov decision processes in finite-dimensional Hilbert spaces. The models are built around a 'unitary-control-then-measure' protocol, in which a learning agent applies unitary transformations to a quantum state and interleaves each control step with a projective measurement onto a prescribed reference basis. Exact closed-form expressions for trajectory probabilities, rewards, and the expected return are derived for four concrete realisations: a closed-chain and an anti-periodic qubit implementation, a qutrit model with ladder coupling, and a four-level two-qubit system. Two structural features of these QRL protocols are rigorously analysed. First, we identify and quantify a two-level reduction in the computational complexity of the expected return, from…
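
The following is a minimal sketch of a unitary-control-then-measure loop on a single qubit, not one of the paper's four models. It assumes a real rotation as the control family, an open-loop sequence of angles as the policy, and a hypothetical reward of 1 for each measured |1>. Because every control step ends in a projective measurement, the outcomes form a classical Markov chain with transition probabilities |<s'|U|s>|^2, so the expected return can be computed either by summing over all 2^T measurement records or by a backward recursion over basis states. This only illustrates one generic route from exponential to polynomial cost and does not reproduce the paper's two-level reduction.

```python
import itertools
import math

def transition_probs(theta):
    """Born-rule probabilities P[s][s'] = |<s'|U(theta)|s>|^2 for the real
    rotation U(theta) = [[cos(t/2), -sin(t/2)], [sin(t/2), cos(t/2)]].
    Illustrative control family, assumed for this sketch."""
    c = math.cos(theta / 2) ** 2
    s = math.sin(theta / 2) ** 2
    return [[c, s], [s, c]]

def reward(state):
    # Hypothetical reward: 1 when the measurement yields |1>, else 0.
    return float(state)

def expected_return_bruteforce(thetas, start=0):
    """Sum probability * return over all 2^T measurement records."""
    total = 0.0
    for traj in itertools.product((0, 1), repeat=len(thetas)):
        prob, ret, prev = 1.0, 0.0, start
        for theta, outcome in zip(thetas, traj):
            prob *= transition_probs(theta)[prev][outcome]
            ret += reward(outcome)
            prev = outcome
        total += prob * ret
    return total

def expected_return_recursive(thetas, start=0):
    """Backward recursion over basis states: cost linear in T for a qubit."""
    value = [0.0, 0.0]  # expected future return after the final step
    for theta in reversed(thetas):
        P = transition_probs(theta)
        value = [
            sum(P[s][t] * (reward(t) + value[t]) for t in (0, 1))
            for s in (0, 1)
        ]
    return value[start]

angles = [0.3, 1.1, 2.0, 0.7]
print(expected_return_bruteforce(angles), expected_return_recursive(angles))
```

Both functions return the same value up to floating-point error; only the brute-force version scales exponentially with the number of control steps.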
