Hybrid Reward-Driven Reinforcement Learning for Efficient Quantum Circuit Synthesis

Sara Giordano; Kornikar Sen; Miguel A. Martin-Delgado

arXiv:2507.16641 · quant-ph · February 18, 2026

TL;DR

This paper presents a reinforcement learning framework using hybrid rewards and state discretization to efficiently synthesize near-optimal quantum circuits, significantly improving resource efficiency in quantum state preparation tasks.

Contribution

It introduces a novel hybrid reward mechanism and a circuit-aware approach for RL-based quantum circuit synthesis, addressing scalability and efficiency challenges.

Findings

1. Successfully synthesizes minimal-depth circuits for up to seven qubits
2. Demonstrates robustness with universal gate sets
3. Achieves resource-efficient quantum circuit optimization

Abstract

A reinforcement learning (RL) framework is introduced for the efficient synthesis of quantum circuits that generate specified target quantum states from a fixed initial state, addressing a central challenge in both the Noisy Intermediate-Scale Quantum (NISQ) era and future fault-tolerant quantum computing. The approach utilizes tabular Q-learning, based on action sequences, within a discretized quantum state space, to effectively manage the exponential growth of the space dimension. The framework introduces a hybrid reward mechanism, combining a static, domain-informed reward that guides the agent toward the target state with customizable dynamic penalties that discourage inefficient circuit structures such as gate congestion and redundant state revisits. This is a circuit-aware reward, in contrast to the current trend of works on this topic, which are primarily fidelity-based. By…
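The ingredients named in the abstract (tabular Q-learning, a discretized state space, and a reward that mixes a static term with dynamic penalties) can be illustrated with a minimal single-qubit sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the gate set {H, T}, the rounding-based discretization, the fidelity-based static term, the penalty magnitudes, the hyperparameters, and all function names. The sketch also keys the Q-table on discretized states, whereas the abstract describes a formulation based on action sequences.

```python
import cmath
import math
import random

# Hypothetical single-qubit gate set, stored as 2x2 matrices (assumption).
S2 = 1 / math.sqrt(2)
GATES = {
    "H": ((S2, S2), (S2, -S2)),
    "T": ((1, 0), (0, cmath.exp(1j * math.pi / 4))),
}
INIT = (1 + 0j, 0j)  # fixed initial state |0>


def apply(gate, state):
    """Multiply the named 2x2 gate matrix by a 2-component state vector."""
    (a, b), (c, d) = GATES[gate]
    return (a * state[0] + b * state[1], c * state[0] + d * state[1])


def fidelity(psi, phi):
    """Squared overlap |<psi|phi>|^2 between two pure states."""
    return abs(sum(x.conjugate() * y for x, y in zip(psi, phi))) ** 2


def discretize(state, decimals=2):
    """Map a state to a hashable grid cell, ignoring global phase."""
    pivot = next(x for x in state if abs(x) > 1e-9)
    phase = pivot / abs(pivot)
    return tuple(
        (round((x / phase).real, decimals), round((x / phase).imag, decimals))
        for x in state
    )


def hybrid_reward(new_state, target, visited, circuit, gate):
    """Static term plus dynamic penalties (all forms and weights assumed)."""
    f = fidelity(new_state, target)
    reward = f + (10.0 if f > 1 - 1e-6 else 0.0)  # static: guide to target
    if discretize(new_state) in visited:
        reward -= 1.0  # dynamic: redundant state revisit
    if circuit and circuit[-1] == gate:
        reward -= 0.5  # dynamic: crude stand-in for gate congestion
    return reward - 0.1  # per-gate cost favouring shallow circuits


def train(target, episodes=500, max_depth=6, alpha=0.5, gamma=0.9,
          eps=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning over discretized states."""
    rng = random.Random(seed)
    actions = list(GATES)
    q = {}
    for _ in range(episodes):
        state, circuit = INIT, []
        visited = {discretize(state)}
        for _ in range(max_depth):
            key = discretize(state)
            if rng.random() < eps:
                gate = rng.choice(actions)
            else:
                gate = max(actions, key=lambda a: q.get((key, a), 0.0))
            nxt = apply(gate, state)
            r = hybrid_reward(nxt, target, visited, circuit, gate)
            done = fidelity(nxt, target) > 1 - 1e-6
            nxt_key = discretize(nxt)
            best_next = 0.0 if done else max(
                q.get((nxt_key, a), 0.0) for a in actions)
            old = q.get((key, gate), 0.0)
            q[(key, gate)] = old + alpha * (r + gamma * best_next - old)
            state = nxt
            circuit.append(gate)
            visited.add(nxt_key)
            if done:
                break
    return q


def greedy_circuit(q, target, max_depth=6):
    """Read out a gate sequence by following the learned Q-values greedily."""
    state, circuit = INIT, []
    for _ in range(max_depth):
        key = discretize(state)
        gate = max(GATES, key=lambda a: q.get((key, a), 0.0))
        state = apply(gate, state)
        circuit.append(gate)
        if fidelity(state, target) > 1 - 1e-6:
            break
    return circuit
```

For example, with `target = apply("T", apply("H", INIT))`, calling `greedy_circuit(train(target), target)` returns the gate sequence the trained table prefers. Because the penalties depend on the episode history, the reward in this sketch is not a function of the discretized state alone; that simplification is part of the illustration and not a claim about the paper's design.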
