Hybrid Reward-Driven Reinforcement Learning for Efficient Quantum Circuit Synthesis
Sara Giordano, Kornikar Sen, Miguel A. Martin-Delgado

TL;DR
This paper presents a reinforcement learning framework using hybrid rewards and state discretization to efficiently synthesize near-optimal quantum circuits, significantly improving resource efficiency in quantum state preparation tasks.
Contribution
It introduces a novel hybrid reward mechanism and a circuit-aware approach for RL-based quantum circuit synthesis, addressing scalability and efficiency challenges.
Findings
Successfully synthesizes minimal-depth circuits for up to seven qubits
Demonstrates robustness with universal gate sets
Achieves resource-efficient quantum circuit optimization
Abstract
A reinforcement learning (RL) framework is introduced for the efficient synthesis of quantum circuits that generate specified target quantum states from a fixed initial state, addressing a central challenge in both the Noisy Intermediate-Scale Quantum (NISQ) era and future fault-tolerant quantum computing. The approach uses tabular Q-learning, based on action sequences, within a discretized quantum state space to manage the exponential growth of the space dimension. The framework introduces a hybrid reward mechanism that combines a static, domain-informed reward, which guides the agent toward the target state, with customizable dynamic penalties that discourage inefficient circuit structures such as gate congestion and redundant state revisits. This reward is circuit-aware, in contrast to most current work on this topic, which relies primarily on fidelity-based rewards. By…
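The ingredients named in the abstract (a discretized state space, tabular Q-learning, and a reward that mixes a static goal term with dynamic penalties) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a single qubit, a hypothetical H/T gate set, rounding-based discretization, illustrative penalty values, and ordinary one-step Q-learning rather than the paper's action-sequence formulation. All function names and parameters are invented for illustration, and a flat per-gate cost stands in for the gate-congestion penalty.

```python
import cmath
import math
import random
from collections import defaultdict

# Hypothetical single-qubit gate set, chosen only for illustration.
S = 1 / math.sqrt(2)
GATES = {
    "H": ((S, S), (S, -S)),
    "T": ((1, 0), (0, cmath.exp(1j * math.pi / 4))),
}
ACTIONS = list(GATES)
START = (complex(1), complex(0))  # fixed initial state |0>
TOL = 1e-6

def apply_gate(name, state):
    """Multiply the 2x2 gate matrix into the state vector."""
    (a, b), (c, d) = GATES[name]
    return (a * state[0] + b * state[1], c * state[0] + d * state[1])

def fidelity(s, t):
    """Squared overlap |<s|t>|^2 of two normalized states."""
    return abs(s[0].conjugate() * t[0] + s[1].conjugate() * t[1]) ** 2

def discretize(state, decimals=3):
    """Hashable table key: strip the global phase, then round amplitudes."""
    pivot = state[0] if abs(state[0]) > 1e-9 else state[1]
    phase = pivot / abs(pivot)
    return tuple(
        (round((amp / phase).real, decimals), round((amp / phase).imag, decimals))
        for amp in state
    )

def hybrid_reward(state, target, revisited, gate_cost=0.05, revisit_cost=0.5):
    """Static goal bonus minus dynamic penalties (all values are assumptions)."""
    static = 1.0 if fidelity(state, target) > 1 - TOL else 0.0
    dynamic = gate_cost + (revisit_cost if revisited else 0.0)
    return static - dynamic

def train(target, episodes=500, max_steps=6, alpha=0.5, gamma=0.9,
          epsilon=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning over discretized states."""
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (state_key, action) to a value estimate
    for _ in range(episodes):
        state, key = START, discretize(START)
        visited = {key}
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(key, a)])
            nxt = apply_gate(action, state)
            nxt_key = discretize(nxt)
            reward = hybrid_reward(nxt, target, nxt_key in visited)
            done = fidelity(nxt, target) > 1 - TOL
            best_next = 0.0 if done else max(q[(nxt_key, a)] for a in ACTIONS)
            q[(key, action)] += alpha * (reward + gamma * best_next - q[(key, action)])
            visited.add(nxt_key)
            state, key = nxt, nxt_key
            if done:
                break
    return q

def greedy_circuit(q, target, max_steps=6):
    """Read a gate sequence off the learned table by acting greedily."""
    state, circuit = START, []
    for _ in range(max_steps):
        if fidelity(state, target) > 1 - TOL:
            break
        key = discretize(state)
        action = max(ACTIONS, key=lambda a: q[(key, a)])
        circuit.append(action)
        state = apply_gate(action, state)
    return circuit

# Toy target: the state reached by applying H and then T to |0>.
TARGET = apply_gate("T", apply_gate("H", START))
q_table = train(TARGET)
print(greedy_circuit(q_table, TARGET))
```

In this toy setting the revisit penalty is what discourages the no-op choice of applying T to |0>, since that gate leaves the discretized state unchanged. The reward depends on the episode's visit history, so it is not strictly Markovian; the sketch accepts that for simplicity.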
