Vectorized Attention with Learnable Encoding for Quantum Transformer

Ziqing Guo; Ziwen Pan; Alex Khan; Jan Balewski

arXiv:2508.18464 · quant-ph · September 5, 2025

TL;DR

This paper introduces the Vectorized Quantum Transformer (VQT), a novel quantum model that improves efficiency and noise resilience in quantum transformer architectures for natural language processing tasks.

Contribution

The paper proposes VQT, a model that computes the ideal masked attention matrix through quantum approximation simulation and trains efficiently via a vectorized nonlinear quantum encoder. This reduces classical sampling overhead and makes quantum NLP applications more practical.

Findings

01 Compared accuracy for IBM and IonQ in quantum circuit simulation.

02 Achieved competitive NLP task results on high-fidelity quantum hardware.

03 Reduced classical sampling overhead in quantum circuit simulation.

Abstract

Vectorized quantum block encoding provides a way to embed classical data into Hilbert space, offering a pathway for quantum models, such as Quantum Transformers (QT), that replace classical self-attention with quantum circuit simulations to operate more efficiently. Current QTs rely on deep parameterized quantum circuits (PQCs), which leaves them vulnerable to QPU noise and hinders their practical performance. In this paper, we propose the Vectorized Quantum Transformer (VQT), a model that supports ideal masked attention matrix computation through quantum approximation simulation and efficient training via a vectorized nonlinear quantum encoder, yielding shot-efficient and gradient-free quantum circuit simulation (QCS) and reduced classical sampling overhead. In addition, we demonstrate an accuracy comparison for IBM and IonQ in quantum circuit simulation and competitive results in…
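
For orientation, the classical quantity that a masked attention matrix usually denotes can be sketched in plain Python. This is standard scaled dot-product attention with a causal mask, which is an assumption about the masking scheme. It is not the VQT quantum procedure, and the function name `masked_attention` and the toy inputs are hypothetical, chosen only for illustration.

```python
import math

def masked_attention(Q, K, V):
    """Classical causal attention over lists of row vectors.

    Illustrative reference only: token i attends to tokens 0..i.
    Returns (attention weight matrix, output vectors).
    """
    d = len(Q[0])          # feature dimension used for scaling
    n = len(K)             # sequence length
    weights, outputs = [], []
    for i, q in enumerate(Q):
        # Scaled dot products against the unmasked positions 0..i.
        scores = [
            sum(a * b for a, b in zip(q, K[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        # Numerically stable softmax over the unmasked scores.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        # Masked (future) positions receive weight exactly zero.
        row = [e / total for e in exps] + [0.0] * (n - i - 1)
        weights.append(row)
        # Output is the weighted sum of the value vectors.
        outputs.append([
            sum(row[j] * V[j][c] for j in range(n))
            for c in range(len(V[0]))
        ])
    return weights, outputs

# Toy usage with two tokens of dimension two (made-up values).
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
A, out = masked_attention(Q, K, V)
```

Each row of `A` sums to one and is zero above the diagonal. A quantum model that approximates masked attention would aim to reproduce a matrix of this kind.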
