Replay-buffer engineering for noise-robust quantum circuit optimization

Akash Kundu; Sebastian Feld

arXiv:2604.21863·quant-ph·April 24, 2026


TL;DR

This paper introduces replay-buffer strategies and a faster evaluation method for noise-robust quantum circuit optimization with deep reinforcement learning, reporting 4-32x gains in sample efficiency and up to 67.5% shorter evaluation time.

Contribution

The paper presents ReaPER+, an annealed replay-sampling rule; OptCRLQAS, which speeds up evaluation; and a transfer scheme for retraining under hardware noise.

Findings

01. ReaPER+ achieves 4-32x sample efficiency gains.

02. OptCRLQAS cuts evaluation time by up to 67.5%.

03. Transfer scheme reduces steps to chemical accuracy by 85-90%.

Abstract

Deep reinforcement learning (RL) for quantum circuit optimization faces three fundamental bottlenecks: replay buffers that ignore the reliability of temporal-difference (TD) targets, curriculum-based architecture search that triggers a full quantum-classical evaluation at every environment step, and the routine discard of noiseless trajectories when retraining under hardware noise. We address all three by treating the replay buffer as a primary algorithmic lever for quantum optimization. We introduce ReaPER+, an annealed replay rule that transitions from TD error-driven prioritization early in training to reliability-aware sampling as value estimates mature, achieving $4$-$32\times$ gains in sample efficiency over fixed PER, ReaPER, and uniform replay while consistently discovering more compact circuits across quantum compilation and QAS benchmarks; validation on LunarLander-v3 confirms…
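
The annealed replay rule described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the linear schedule, the priority formula, the exponent `alpha`, and the per-transition `reliabilities` scores in (0, 1] are all assumptions chosen for illustration, since the abstract does not specify them. The sketch only shows the general idea of starting from TD error-driven priorities and gradually weighting them by a reliability score.

```python
import random


def anneal_weight(step, total_steps):
    """Hypothetical linear schedule: 0.0 = pure TD-error priority, 1.0 = fully reliability-aware."""
    return min(1.0, max(0.0, step / total_steps))


def priorities(td_errors, reliabilities, lam, alpha=0.6, eps=1e-6):
    """Assumed priority form: (|delta_i| + eps)^alpha * r_i^lam.

    With lam = 0 the reliability term is 1 and this reduces to standard
    proportional prioritization on the TD error; with lam = 1 each priority
    is scaled by its (hypothetical) reliability score r_i in (0, 1].
    """
    return [
        ((abs(delta) + eps) ** alpha) * (r ** lam)
        for delta, r in zip(td_errors, reliabilities)
    ]


def sample_indices(td_errors, reliabilities, step, total_steps, batch_size, rng=random):
    """Draw a batch of buffer indices with probability proportional to the annealed priority."""
    lam = anneal_weight(step, total_steps)
    weights = priorities(td_errors, reliabilities, lam)
    return rng.choices(range(len(weights)), weights=weights, k=batch_size)


if __name__ == "__main__":
    td = [0.5, 2.0, 0.1, 1.0]      # illustrative TD errors
    rel = [1.0, 0.2, 0.9, 0.8]     # illustrative reliability scores
    rng = random.Random(0)
    early = sample_indices(td, rel, step=0, total_steps=1000, batch_size=8, rng=rng)
    late = sample_indices(td, rel, step=1000, total_steps=1000, batch_size=8, rng=rng)
    print("early batch:", early)
    print("late batch:", late)
```

Early in training the transition with the largest TD error dominates sampling regardless of its reliability; late in training its low reliability score pulls its priority down. How reliability is actually computed, and how the schedule is shaped, would follow the paper rather than this sketch.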
