Quantum-enhanced Large Language Models on Quantum Hardware via Cayley Unitary Adapters

Borja Aizpurua; Sukhbinder Singh; Augustine Kshetrimayum; Saeed S. Jahromi; Roman Orus

arXiv:2605.05914·quant-ph·May 8, 2026

TL;DR

This paper demonstrates that quantum circuit blocks, inserted into pre-trained large language models and executed on a 156-qubit quantum processor, can improve language-modeling perplexity and recover part of the degradation caused by classical compression, showing promise for quantum AI.

Contribution

The paper introduces Cayley-parameterised unitary adapters for LLMs and reports perplexity improvements on real quantum hardware with only 6,000 additional parameters for Llama 3.1 8B.

Findings

01 Perplexity of Llama 3.1 8B improved by 1.4% on quantum hardware.

02 Achieved 83% recovery of compression-induced degradation.

03 Identified a noise-expressivity phase transition at larger qubit scales.
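The percentages in the findings can be read against the usual definitions of perplexity, relative improvement, and recovered degradation. The summary does not state the exact formulas used, so the sketch below assumes the common ones; the function names and the numbers in the usage lines are hypothetical and purely illustrative, not results from the paper.

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

def relative_improvement(ppl_base, ppl_new):
    """Fractional perplexity reduction relative to the baseline (assumed definition)."""
    return (ppl_base - ppl_new) / ppl_base

def recovery_fraction(ppl_original, ppl_compressed, ppl_adapted):
    """Share of the compression-induced perplexity gap closed by the adapter
    (assumed definition)."""
    return (ppl_compressed - ppl_adapted) / (ppl_compressed - ppl_original)

# Illustrative numbers only (not taken from the paper).
print(perplexity([math.log(2.0), math.log(2.0)]))   # uniform choice between 2 tokens
print(relative_improvement(10.0, 9.9))               # a 1% reduction
print(recovery_fraction(10.0, 12.0, 10.5))           # 75% of the gap recovered
```

Under these assumed definitions, a 1.4% improvement means the adapted model's perplexity is 0.986 times the baseline, and an 83% recovery means the adapter closes 83% of the gap between the compressed and the original model.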

Abstract

Large language models (LLMs) have transformed artificial intelligence, yet classical architectures impose a fundamental constraint: every trainable parameter demands classical memory that scales unfavourably with model size. Quantum computing offers a qualitatively different pathway, but practical demonstrations on real hardware have remained elusive for models of practical relevance. Here we show that Cayley-parameterised unitary adapters -- quantum circuit blocks inserted into the frozen projection layers of pre-trained LLMs and executed on a 156-qubit IBM Quantum System Two superconducting processor -- improve the perplexity of Llama 3.1 8B, an 8-billion-parameter model in widespread use, by 1.4% with only 6,000 additional parameters and end-to-end inference validated on a real Quantum Processing Unit (QPU). A systematic study on SmolLM2 (135M parameters), chosen for its tractability,…
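As background on the term "Cayley-parameterised", the following is a minimal classical sketch of the standard Cayley transform, which maps a skew-symmetric generator A to an orthogonal (real unitary) matrix U = (I + A)^{-1}(I - A). It is not the authors' implementation: the function name `cayley_unitary`, the real-valued restriction, the block size of 8, and the random parameters are all assumptions made for illustration, and nothing here models how the block is compiled to or run on quantum hardware.

```python
import numpy as np

def cayley_unitary(params: np.ndarray, dim: int) -> np.ndarray:
    """Map dim*(dim-1)/2 real parameters to a dim x dim orthogonal matrix.

    Hypothetical helper: illustrates the Cayley transform only.
    """
    # Build a skew-symmetric generator A (A.T == -A) from the free parameters.
    A = np.zeros((dim, dim))
    A[np.triu_indices(dim, k=1)] = params
    A = A - A.T
    I = np.eye(dim)
    # Cayley transform. (I + A) is always invertible for skew-symmetric A,
    # and (I + A)^{-1} commutes with (I - A), so the factor order is immaterial.
    return np.linalg.solve(I + A, I - A)

dim = 8                                  # assumed block size, for illustration
n_params = dim * (dim - 1) // 2          # 28 free parameters for dim = 8
rng = np.random.default_rng(0)
U = cayley_unitary(0.1 * rng.standard_normal(n_params), dim)

# U is orthogonal, so it preserves vector norms.
print(np.allclose(U.T @ U, np.eye(dim)))
# Zero parameters give the identity, so such a block starts as a no-op.
print(np.allclose(cayley_unitary(np.zeros(n_params), dim), np.eye(dim)))
```

One appeal of this parameterisation is that every parameter setting yields an exactly unitary matrix, so a gradient-based optimiser can update the parameters freely without a separate re-orthogonalisation step.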
