Detecting Flaky Tests in Quantum Software: A Dynamic Approach
Dongchan Kim, Hamidreza Khoramrokh, Lei Zhang, Andriy Miranskyy

TL;DR
This study presents the first large-scale dynamic analysis of flaky tests in quantum software. It finds that flakiness is episodic, rare overall, and unevenly distributed across components, all of which make detection difficult.
Contribution
The paper provides the first empirical characterization of flaky tests in quantum software through extensive dynamic testing, and it introduces a dataset for future research.
Findings
Flaky tests are rare but episodic in quantum software.
Detecting flaky tests requires many executions because their failure probabilities are low.
Flakiness is unevenly distributed across software components.
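To illustrate why low failure probabilities demand many executions, the sketch below computes how many reruns are needed to observe at least one failure with a chosen confidence. It is not the paper's procedure: the function name `runs_needed` is hypothetical, and the calculation assumes independent runs with a fixed per-run failure probability, a simplification given that the findings describe flakiness as episodic.

```python
import math


def runs_needed(p_fail: float, confidence: float = 0.95) -> int:
    """Smallest n such that 1 - (1 - p_fail)**n >= confidence.

    Assumes runs are independent and p_fail is constant across runs
    (an illustrative simplification, not a claim from the paper).
    """
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(no failure in n runs) = (1 - p_fail)**n; solve for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    for p in (0.1, 0.01, 0.001):
        print(f"p_fail={p}: {runs_needed(p)} runs for 95% detection confidence")
```

Under these assumptions, the required number of runs grows roughly in proportion to 1 / p_fail, so a test that fails once in a thousand runs needs on the order of thousands of executions to be caught reliably.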
Abstract
Flaky tests, tests that pass or fail nondeterministically without changes to code or environment, pose a serious threat to software reliability. While classical software engineering has developed a rich body of dynamic and static techniques to study flakiness, corresponding evidence for quantum software remains limited. Prior work relies primarily on static analysis or small sets of manually reported incidents, leaving open questions about the prevalence, characteristics, and detectability of flaky tests. This paper presents the first large-scale dynamic characterization of flaky tests in quantum software. We executed the Qiskit Terra test suite 10,000 times across 23 releases in controlled environments. For each release, we measured test-outcome variability, identified flaky tests, estimated empirical failure probabilities, analyzed recurrence across versions, and used Wilson…
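The abstract is cut off after "Wilson", so the exact method is not recoverable from this page. Assuming it refers to the Wilson score interval, a standard confidence interval for a binomial proportion, the sketch below shows how an empirical failure probability and its interval could be computed from repeated runs. The function name `wilson_interval` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def wilson_interval(failures: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    failures: number of failing executions observed for one test
    runs:     total executions of that test
    z:        normal quantile (1.96 corresponds to roughly 95% confidence)
    """
    if runs <= 0 or not 0 <= failures <= runs:
        raise ValueError("require runs > 0 and 0 <= failures <= runs")
    p_hat = failures / runs  # empirical failure probability
    denom = 1.0 + z * z / runs
    center = (p_hat + z * z / (2.0 * runs)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / runs + z * z / (4.0 * runs * runs)) / denom
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hypothetical counts: 3 failures observed in 10,000 executions.
    low, high = wilson_interval(3, 10_000)
    print(f"empirical p = {3 / 10_000:.4f}, interval = [{low:.6f}, {high:.6f}]")
```

One reason this interval suits rare events is that it stays informative when zero failures are observed: the upper bound remains above zero, so a test with no failures in a finite number of runs is not declared certainly stable.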
Taxonomy
Topics: Quantum Computing Algorithms and Architecture · Distributed systems and fault tolerance · Software Reliability and Analysis Research
