Learning Complex Word Embeddings in Classical and Quantum Spaces
Carys Harvey, Stephen Clark, Douglas Brown, Konstantinos Meichanetzidis

TL;DR
This paper introduces methods for training complex-valued and quantum-inspired word embeddings, demonstrating scalable approaches that perform comparably to classical models on standard benchmarks.
Contribution
It develops a scalable two-stage process for creating complex and quantum-inspired embeddings, enabling large vocabulary training with performance comparable to classical methods.
Findings
Quantum-inspired embeddings perform comparably to classical embeddings with a similar number of parameters.
Training quantum circuits directly can harm performance.
The approach scales with vocabulary size, not corpus size.
Abstract
We present a variety of methods for training complex-valued word embeddings, based on the classical Skip-gram model, with a straightforward adaptation simply replacing the real-valued vectors with arbitrary vectors of complex numbers. In a more "physically-inspired" approach, the vectors are produced by parameterised quantum circuits (PQCs), which are unitary transformations resulting in normalised vectors which have a probabilistic interpretation. We develop a complex-valued version of the highly optimised C code version of Skip-gram, which allows us to easily produce complex embeddings trained on a 3.8B-word corpus for a vocabulary size of over 400k, for which we are then able to train a separate PQC for each word. We evaluate the complex embeddings on a set of standard similarity and relatedness datasets, for some models obtaining results competitive with the classical baseline. We…
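To make the "straightforward adaptation" concrete, here is a minimal sketch of a Skip-gram-style loss over complex vectors, together with the normalisation that gives a vector a probabilistic reading. The abstract does not say how two complex vectors are scored, so the use of the real part of the Hermitian inner product is an assumption, as are the negative-sampling form of the loss and all function names (`hermitian_score`, `sgns_loss`, `normalise`). It is an illustration only, not the authors' C implementation.

```python
import math
import random

def hermitian_score(u, v):
    """Real part of the Hermitian inner product sum(conj(u_i) * v_i).

    Assumption: the source does not specify the scoring function.
    """
    return sum(a.conjugate() * b for a, b in zip(u, v)).real

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sgns_loss(target, positive, negatives):
    """Negative-sampling loss for one (target, context) pair (assumed form)."""
    loss = -math.log(sigmoid(hermitian_score(target, positive)))
    for neg in negatives:
        loss -= math.log(sigmoid(-hermitian_score(target, neg)))
    return loss

def normalise(vec):
    """Scale a complex vector to unit norm, like a quantum state vector."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in vec))
    return [a / norm for a in vec]

def random_complex_vector(dim, rng):
    """Hypothetical initialiser: independent Gaussian real and imaginary parts."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]

rng = random.Random(0)
target = random_complex_vector(4, rng)
positive = random_complex_vector(4, rng)
negatives = [random_complex_vector(4, rng) for _ in range(3)]

loss = sgns_loss(target, positive, negatives)

# Squared magnitudes of a unit-norm vector sum to 1, which is what
# allows them to be read as a probability distribution.
state = normalise(target)
probabilities = [abs(a) ** 2 for a in state]
print(loss, sum(probabilities))
```

In the arbitrary-complex-vector setting only the loss is needed. The normalisation step illustrates the constraint that a PQC-produced vector satisfies by construction, since a unitary transformation preserves the norm of its input state.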
Taxonomy
Topics: Topic Modeling
Methods: Sparse Evolutionary Training
