Reinforcement Learning for Parameterized Quantum State Preparation: A Comparative Study
Gerhard Stenzel, Isabella Debelic, Michael Kölle, Tobias Rohe, Leo Sünkel, Julian Hager, Claudia Linnhoff-Popien

TL;DR
This study applies reinforcement learning to parameterized quantum circuits for state preparation, comparing a one-stage and a two-stage training regime on systems of two to ten qubits.
Contribution
It introduces a reinforcement learning framework for continuous quantum state preparation and compares two training regimes, providing insights into their performance and scalability.
Findings
PPO learns effective policies in this setting, whereas A2C does not.
Both methods achieve high success rates for basis and Bell states.
Scalability limits are observed beyond 4 qubits.
Abstract
We extend directed quantum circuit synthesis (DQCS) with reinforcement learning from purely discrete gate selection to parameterized quantum state preparation with continuous single-qubit rotations \(R_x\), \(R_y\), and \(R_z\). We compare two training regimes: a one-stage agent that jointly selects the gate type, the affected qubit(s), and the rotation angle; and a two-stage variant that first proposes a discrete circuit and subsequently optimizes the rotation angles with Adam using parameter-shift gradients. Using Gymnasium and PennyLane, we evaluate Proximal Policy Optimization (PPO) and Advantage Actor--Critic (A2C) on systems comprising two to ten qubits and on targets of increasing complexity with \(\lambda\) ranging from one to five. Whereas A2C does not learn effective policies in this setting, PPO succeeds under stable hyperparameters (one-stage: learning rate approximately…
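The second stage described in the abstract (Adam driven by parameter-shift gradients) can be illustrated on a toy problem. The sketch below is not the authors' implementation and does not use PennyLane: it assumes a single qubit, a single \(R_y(\theta)\) gate applied to \(|0\rangle\), and the target state \(|+\rangle\), for which the fidelity is \((1 + \sin\theta)/2\). The function names `fidelity`, `parameter_shift_grad`, and `adam_maximize`, as well as the learning rate and step count, are hypothetical choices made for illustration.

```python
import math

# Assumed toy target: |+> = (|0> + |1>) / sqrt(2), with real amplitudes.
PLUS_STATE = (1 / math.sqrt(2), 1 / math.sqrt(2))


def fidelity(theta, target=PLUS_STATE):
    """Fidelity between R_y(theta)|0> and a real-amplitude single-qubit target."""
    # R_y(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>
    amp0, amp1 = math.cos(theta / 2), math.sin(theta / 2)
    overlap = target[0] * amp0 + target[1] * amp1
    return overlap ** 2


def parameter_shift_grad(f, theta, shift=math.pi / 2):
    """Parameter-shift rule for a rotation gate: two evaluations of f, no finite differences."""
    return (f(theta + shift) - f(theta - shift)) / 2.0


def adam_maximize(f, theta, lr=0.1, steps=200, beta1=0.9, beta2=0.999, eps=1e-8):
    """Scalar Adam that minimizes the loss 1 - f(theta) using parameter-shift gradients."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = -parameter_shift_grad(f, theta)  # gradient of the loss 1 - f
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


theta_opt = adam_maximize(fidelity, theta=0.1)
final_fidelity = fidelity(theta_opt)
print(f"theta = {theta_opt:.4f}, fidelity = {final_fidelity:.6f}")
```

In a multi-qubit setting the same loop would run over a vector of angles, one per rotation gate in the circuit proposed by the first stage, with two circuit evaluations per angle for each gradient.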
Taxonomy
Topics: Quantum Computing Algorithms and Architecture · Machine Learning in Materials Science · Quantum many-body systems
