High-performance Vector-length Agnostic Quantum Circuit Simulations on ARM Processors
Ruimin Shi, Gabin Schieffer, Pei-Hung Lin, Maya Gokhale, Andreas Herten, Ivy Peng

TL;DR
This paper explores a vector-length agnostic (VLA) design for quantum state-vector simulations on ARM processors. Its optimization techniques yield speedups of up to 4.5x on A64FX, 2.5x on Grace, and 1.5x on Graviton3, and the study offers insights for future portable, high-performance VLA implementations.
Contribution
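The paper introduces a VLA design for quantum state-vector simulations, together with four optimization techniques: VLEN-adaptive memory layout adjustment, load buffering, fine-grained loop control, and gate fusion-based arithmetic intensity adaptation. It demonstrates portability and performance improvements on three ARM-based processors (NVIDIA Grace, AWS Graviton3, and Fujitsu A64FX).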
Findings
Up to 4.5x speedup on A64FX
Up to 2.5x speedup on Grace
Up to 1.5x speedup on Graviton3
Abstract
ARM SVE and RISC-V RVV are emerging vector architectures in high-end processors that support vectorization with flexible vector lengths. In this work, we leverage an important workload for quantum computing, quantum state-vector simulations, to understand whether high-performance portability can be achieved in a vector-length agnostic (VLA) design. We propose a VLA design and optimization techniques critical for achieving high performance, including VLEN-adaptive memory layout adjustment, load buffering, fine-grained loop control, and gate fusion-based arithmetic intensity adaptation. We provide an implementation in Google's Qsim and evaluate five quantum circuits of up to 36 qubits on three ARM processors: NVIDIA Grace, AWS Graviton3, and Fujitsu A64FX. By defining new metrics and PMU events to quantify vectorization activities, we draw generic insights for future VLA designs.…
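To make the ideas in the abstract concrete, here is a minimal Python sketch of a vector-length agnostic loop for a single-qubit gate, plus a simple form of gate fusion. It is an illustration under stated assumptions, not the paper's Qsim implementation: the function names `apply_1q_gate` and `fuse_gates` and the `vlen` parameter are hypothetical, and plain Python loops stand in for SVE vector instructions and predicates.

```python
import math

def apply_1q_gate(state, gate, target, vlen):
    """Apply a 2x2 gate to qubit `target` of `state`, in place.

    `vlen` (hypothetical) is the number of amplitude pairs handled per
    "vector" step. It is a runtime value, so the loop never hard-codes a
    vector width, which is the essence of a VLA loop.
    """
    n = len(state)
    stride = 1 << target  # distance between the two amplitudes of a pair
    (g00, g01), (g10, g11) = gate
    for block in range(0, n, 2 * stride):
        lane = 0
        while lane < stride:
            # `active` plays the role of a predicate: only this many
            # lanes do work in the current step (handles the loop tail).
            active = min(vlen, stride - lane)
            for k in range(active):
                i = block + lane + k
                a0, a1 = state[i], state[i + stride]
                state[i] = g00 * a0 + g01 * a1
                state[i + stride] = g10 * a0 + g11 * a1
            lane += active
    return state

def fuse_gates(second, first):
    """Fuse two 2x2 gates on the same qubit into one matrix (second @ first)."""
    return [[sum(second[r][k] * first[k][c] for k in range(2))
             for c in range(2)] for r in range(2)]

# Example gates used for illustration.
_S = 1 / math.sqrt(2)
H = [[_S, _S], [_S, -_S]]  # Hadamard
X = [[0, 1], [1, 0]]       # Pauli-X
```

The result does not depend on `vlen`; only the number of steps changes. In this naive sketch, a low target qubit gives contiguous runs shorter than `vlen`, so most lanes in a step go unused. Fusing gates (for example `fuse_gates(H, X)` applied once instead of `X` followed by `H`) performs more arithmetic per pass over the state vector, which is the general idea behind raising arithmetic intensity.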
Taxonomy
Topics: Quantum Computing Algorithms and Architecture · Parallel Computing and Optimization Techniques · Quantum-Dot Cellular Automata
