Complex-Valued Phase-Coherent Transformer

Leona Hioki

arXiv:2605.10123·cs.LG·May 12, 2026

TL;DR

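The paper introduces the Phase-Coherent Transformer (PCT), a complex-valued Transformer whose attention is designed to preserve phase information across layers. Under parameter-fair comparison on mid-scale benchmarks, it outperforms both the standard softmax Transformer and its direct complex-valued counterpart.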

Contribution

It proposes phase-coherent attention, which applies a real-valued, element-independent, smooth gate to L2-normalised complex query-key similarities. This replaces the row-normalised token competition of softmax with token-non-competing attention.

Findings

01

Under parameter-fair comparison, PCT consistently outperforms both the standard softmax Transformer and its direct complex-valued counterpart.

02

Gates preserving phase coherence are crucial for long-range retrieval tasks.

03

PCT maintains accuracy across various depths without collapse.

Abstract

Complex-valued Transformers have largely inherited softmax attention from real-valued architectures. However, row-normalised token competition is not necessarily aligned with phase-preserving computation. In this paper, we introduce the Phase-Coherent Transformer (PCT), which applies a real-valued, element-independent, smooth gate to L2-normalised complex query-key similarities. PCT replaces token competition with token-non-competing attention and is designed to preserve phase information across layers. Across mid-scale benchmarks spanning long-range memory, hierarchical long-range reasoning, positional retrieval, phase-based memory and superposition, and image classification, PCT shows strong generalisation across task categories. Under parameter-fair comparison, PCT consistently outperforms both the standard softmax Transformer and its direct complex-valued counterpart. Moreover,…
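The abstract describes the attention rule only in words, so the following is a minimal NumPy sketch of one possible reading, not the paper's implementation. Everything specific here is an assumption: the function names, the choice of a sigmoid as the smooth gate, feeding the gate the real part of the similarity, the `scale` parameter, and multiplying the gate back onto the complex similarity. What it is meant to illustrate is the contrast with softmax: each weight depends only on its own query-key pair, so no row normalisation couples the tokens, and a positive real gate rescales a complex similarity without rotating its phase.

```python
import numpy as np


def l2_normalise(x, eps=1e-8):
    """Scale each complex row vector to (approximately) unit L2 norm."""
    norm = np.sqrt(np.sum(np.abs(x) ** 2, axis=-1, keepdims=True))
    return x / (norm + eps)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_complex_attention(q, k, v, scale=4.0):
    """Hypothetical token-non-competing attention over complex inputs.

    q: (n_q, d) complex, k: (n_k, d) complex, v: (n_k, d_v) complex.
    Returns (weights, output). The gate and `scale` are assumptions,
    not taken from the paper.
    """
    # Complex similarity between unit-norm queries and keys, |sim| <= 1.
    sim = l2_normalise(q) @ l2_normalise(k).conj().T

    # Real-valued, smooth gate applied independently to every element.
    # Assumption: sigmoid of the scaled real part of the similarity.
    gate = sigmoid(scale * sim.real)

    # A positive real gate rescales magnitude and leaves the phase of sim
    # untouched. There is no sum over keys, so rows need not sum to 1.
    weights = gate * sim
    return weights, weights @ v


rng = np.random.default_rng(0)
q = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))
k = rng.normal(size=(5, 4)) + 1j * rng.normal(size=(5, 4))
v = rng.normal(size=(5, 4)) + 1j * rng.normal(size=(5, 4))

weights, out = gated_complex_attention(q, k, v)
print(weights.shape, out.shape)
```

In this sketch, changing one key alters only that key's column of `weights`. Under softmax, the same change would shift every weight in the row through the shared normaliser.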
