Fast and memory-efficient classical simulation of quantum machine learning via forward and backward gate fusion

Yoshiaki Kawase

arXiv:2603.02804·quant-ph·April 6, 2026

TL;DR

This paper combines a gate fusion technique with gradient checkpointing to speed up classical simulation of large-scale quantum machine learning models and reduce its memory usage, making training on large datasets practical.

Contribution

The author proposes a method that fuses multiple consecutive gates in both the forward and backward passes, improving simulation throughput by minimizing global memory accesses and reducing memory consumption for large-scale quantum machine learning research.

Findings

01

Achieved approximately 20x throughput improvement for a Hardware-Efficient Ansatz with 12 or more qubits.

02

Reached over 30x throughput improvement on a mid-range GPU.

03

Enabled training of a 20-qubit, 1000-layer model with 60,000 parameters in about 20 minutes per epoch.
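
To give a sense of why memory matters at this scale, the following back-of-the-envelope sketch estimates state-vector storage for the 20-qubit, 1000-layer, 60,000-parameter model. The single-precision complex format (8 bytes per amplitude), the "one stored state per parameterized gate" baseline, and the "one checkpoint per layer" schedule are all assumptions made for illustration; this summary does not state the precision or checkpointing schedule actually used.

```python
# Rough memory estimate for state-vector simulation with backpropagation.
# Assumptions (not from the paper): complex64 amplitudes, a naive baseline
# that stores one intermediate state per parameterized gate, and a
# checkpointing schedule that stores one state per layer.

def state_bytes(n_qubits: int, bytes_per_amplitude: int = 8) -> int:
    """Bytes needed for one full state vector of 2**n_qubits amplitudes."""
    return (2 ** n_qubits) * bytes_per_amplitude

N_QUBITS = 20
N_LAYERS = 1000
N_PARAMS = 60_000

GIB = 2 ** 30

one_state = state_bytes(N_QUBITS)        # one 20-qubit state vector
naive_total = one_state * N_PARAMS       # hypothetical: one state per parameter
per_layer_total = one_state * N_LAYERS   # hypothetical: one checkpoint per layer

print(f"one state:            {one_state / 2**20:.2f} MiB")
print(f"state per parameter:  {naive_total / GIB:.2f} GiB")
print(f"checkpoint per layer: {per_layer_total / GIB:.2f} GiB")
```

Under these assumptions a single state is small, but keeping every intermediate state for the backward pass is far beyond the memory of a single GPU, which is the kind of cost gradient checkpointing trades against recomputation.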

Abstract

While real quantum devices have been increasingly used to conduct research focused on achieving quantum advantage or quantum utility in recent years, executing deep quantum circuits or performing quantum machine learning with large-scale data on current noisy intermediate-scale quantum devices remains challenging, making classical simulation essential for quantum machine learning research. However, such classical simulation often suffers from the cost of gradient calculations, requiring enormous memory or computational time. To address these problems, we propose a method to fuse multiple consecutive gates in each of the forward and backward paths to improve throughput by minimizing global memory accesses. As a result, we achieved approximately $20$ times throughput improvement for a Hardware-Efficient Ansatz with $12$ or more qubits, reaching over $30$ times improvement on a mid-range…
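
The idea of fusing consecutive gates to cut memory traffic can be illustrated with a minimal NumPy sketch. It is not the author's implementation, which targets forward and backward passes on GPU hardware; it only shows the simplest case, where several single-qubit gates acting on the same qubit are multiplied into one 2x2 matrix so the state vector is traversed once instead of once per gate. The helper names (`apply_1q`, `run_unfused`, `run_fused`) and the RZ-RY-RZ gate sequence are assumptions chosen for the example.

```python
import numpy as np

def rz(theta):
    """Single-qubit Z rotation."""
    return np.array([[np.exp(-0.5j * theta), 0.0],
                     [0.0, np.exp(0.5j * theta)]])

def ry(theta):
    """Single-qubit Y rotation."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def apply_1q(state, gate, qubit, n):
    """Apply a 2x2 gate to one qubit; this reads and writes the whole state once."""
    psi = state.reshape([2] * n)
    psi = np.tensordot(gate, psi, axes=([1], [qubit]))
    psi = np.moveaxis(psi, 0, qubit)  # put the contracted axis back in place
    return psi.reshape(-1)

def run_unfused(state, gates_per_qubit, n):
    """Apply every gate separately: one pass over the state per gate."""
    passes = 0
    for q, gates in enumerate(gates_per_qubit):
        for g in gates:
            state = apply_1q(state, g, q, n)
            passes += 1
    return state, passes

def run_fused(state, gates_per_qubit, n):
    """Multiply each qubit's gates into one matrix: one pass per qubit."""
    passes = 0
    for q, gates in enumerate(gates_per_qubit):
        fused = np.eye(2, dtype=complex)
        for g in gates:
            fused = g @ fused  # later gates act from the left
        state = apply_1q(state, fused, q, n)
        passes += 1
    return state, passes

if __name__ == "__main__":
    n = 4
    rng = np.random.default_rng(0)
    state = rng.normal(size=2 ** n) + 1j * rng.normal(size=2 ** n)
    state /= np.linalg.norm(state)
    angles = rng.uniform(0, 2 * np.pi, size=(n, 3))
    gates = [[rz(a), ry(b), rz(c)] for a, b, c in angles]

    out_a, passes_a = run_unfused(state, gates, n)
    out_b, passes_b = run_fused(state, gates, n)
    print("same result:", np.allclose(out_a, out_b))
    print("passes over the state:", passes_a, "vs", passes_b)
```

Both routines produce the same final state, but the fused version touches the full state vector a third as often in this example. On a GPU, where each pass is a round trip to global memory, that reduction in passes is the source of the throughput gain the abstract describes.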
