Quantum Verifiable Rewards for Post-Training Qiskit Code Assistant

Nicolas Dupuis; Adarsh Tiwari; Youssef Mroueh; David Kremer; Ismael Faro; Juan Cruz-Benito

arXiv:2508.20907·quant-ph·August 29, 2025

Open Access

TL;DR

This paper introduces quantum-verifiable rewards and post-training techniques that improve LLM generation of Qiskit code, using quantum verification and a synthetic data pipeline to check code quality and executability on quantum hardware.

Contribution

The paper presents a quantum verification method for checking generated code and a synthetic data pipeline that produces quantum problem-unit test pairs, which are used to build preference data for aligning LLMs with DPO.

Findings

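01 The best model, which combines DPO and GRPO, surpasses the strongest open-source baselines on Qiskit-HumanEval-hard

02 Quantum-verifiable rewards improve code quality and executability on quantum hardware

03 The synthetic data pipeline generates quantum problem-unit test pairs that support DPO preference training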

Abstract

Qiskit is an open-source quantum computing framework that allows users to design, simulate, and run quantum circuits on real quantum hardware. We explore post-training techniques for LLMs to assist in writing Qiskit code. We introduce quantum verification as an effective method for ensuring code quality and executability on quantum hardware. To support this, we developed a synthetic data pipeline that generates quantum problem-unit test pairs and used it to create preference data for aligning LLMs with DPO. Additionally, we trained models using GRPO, leveraging quantum-verifiable rewards provided by the quantum hardware. Our best-performing model, combining DPO and GRPO, surpasses the strongest open-source baselines on the challenging Qiskit-HumanEval-hard benchmark.
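The abstract describes two uses of verification: building DPO preference data from problem-unit test pairs, and scoring GRPO samples with a verifiable reward. The sketch below illustrates that general pattern only. It is not the authors' implementation: every function name is hypothetical, a plain Python unit test stands in for quantum verification on hardware, and the group-standardized advantage is the commonly used GRPO formulation rather than anything confirmed by this page.

```python
import statistics
from itertools import product


def verifiable_reward(candidate_code: str, unit_test: str) -> float:
    """Return 1.0 if the candidate passes the unit test, else 0.0.

    Hypothetical stand-in: in the paper the reward comes from quantum
    verification; here an ordinary Python assertion plays that role.
    """
    namespace = {}
    try:
        exec(candidate_code, namespace)  # define the candidate solution
        exec(unit_test, namespace)       # assertions raise on failure
    except Exception:
        return 0.0
    return 1.0


def preference_pairs(prompt, candidates, unit_test):
    """Pair every passing candidate (chosen) with every failing one (rejected)."""
    scored = [(c, verifiable_reward(c, unit_test)) for c in candidates]
    passed = [c for c, r in scored if r == 1.0]
    failed = [c for c, r in scored if r == 0.0]
    return [
        {"prompt": prompt, "chosen": p, "rejected": f}
        for p, f in product(passed, failed)
    ]


def group_relative_advantages(rewards):
    """Standardize rewards within one group of samples for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0.0:
        # All samples scored the same, so the group carries no learning signal.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]


if __name__ == "__main__":
    # Toy problem and candidates, invented for illustration.
    prompt = "Return the set of measurement outcomes expected from a Bell state."
    good = 'def bell_outcomes():\n    return {"00", "11"}\n'
    bad = 'def bell_outcomes():\n    return {"00", "01"}\n'
    test = 'assert bell_outcomes() == {"00", "11"}'

    print(preference_pairs(prompt, [good, bad], test))
    rewards = [verifiable_reward(c, test) for c in [good, bad, good, bad]]
    print(group_relative_advantages(rewards))
```

A real pipeline would run candidates in a sandboxed process rather than with `exec`, and would replace the assertion with a check that the generated circuit transpiles and executes on the target backend.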

Taxonomy

Topics: Quantum Computing Algorithms and Architecture