Harmonic Token Projection (HTP): A Vocabulary-Free, Training-Free, Deterministic, and Reversible Embedding Methodology
Tcharlies Schmitz

TL;DR
Harmonic Token Projection (HTP) is a deterministic, reversible text embedding method that encodes tokens analytically as harmonic trajectories. It reaches a Spearman correlation of 0.68 on English STS-B without training or a vocabulary.
Contribution
HTP introduces a vocabulary-free, training-free embedding framework based on harmonic analysis, providing a transparent and efficient alternative to neural embeddings.
Findings
Achieves Spearman correlation of 0.68 on STS-B in English
Maintains stable multilingual performance across ten languages
Offers low-latency, computationally efficient embeddings
Abstract
This paper introduces the Harmonic Token Projection (HTP), a reversible and deterministic framework for generating text embeddings without training, vocabularies, or stochastic parameters. Unlike neural embeddings that rely on statistical co-occurrence or optimization, HTP encodes each token analytically as a harmonic trajectory derived from its Unicode integer representation, establishing a bijective and interpretable mapping between discrete symbols and continuous vector space. The harmonic formulation provides phase-coherent projections that preserve both structure and reversibility, enabling semantic similarity estimation from purely geometric alignment. Experimental evaluation on the Semantic Textual Similarity Benchmark (STS-B) and its multilingual extension shows that HTP achieves a Spearman correlation of ρ = 0.68 in English, maintaining stable performance across ten…
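To make the idea of a reversible, training-free harmonic encoding concrete, here is a minimal Python sketch. It is not taken from the paper: the abstract does not give the exact formulation, so the choice of moduli, the sentinel byte, and the function names (`token_to_int`, `encode`, `decode`) are all assumptions made for illustration. The sketch turns a token into an integer via its UTF-8 bytes, encodes that integer's residues modulo several coprime numbers as (sin, cos) phase pairs, and recovers the token with `atan2` plus the Chinese Remainder Theorem. It illustrates determinism and reversibility only and says nothing about the semantic similarity results reported above.

```python
import math

# Hypothetical pairwise-coprime moduli (all primes). Their product bounds the
# largest integer that can be encoded and recovered without ambiguity.
MODULI = [251, 241, 239, 233, 229, 227, 223, 211,
          199, 197, 193, 191, 181, 179, 173, 167]
LIMIT = math.prod(MODULI)


def token_to_int(token: str) -> int:
    """Map a token to an integer; the 0x01 sentinel keeps leading zero bytes."""
    return int.from_bytes(b"\x01" + token.encode("utf-8"), "big")


def int_to_token(n: int) -> str:
    """Invert token_to_int by dropping the sentinel byte."""
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return raw[1:].decode("utf-8")


def encode(token: str) -> list[float]:
    """Encode a token as (sin, cos) pairs, one pair per modulus."""
    n = token_to_int(token)
    if n >= LIMIT:
        raise ValueError("token too long for the chosen moduli")
    vec = []
    for m in MODULI:
        theta = 2.0 * math.pi * (n % m) / m  # phase of the residue on the unit circle
        vec.extend((math.sin(theta), math.cos(theta)))
    return vec


def decode(vec: list[float]) -> str:
    """Recover the token: read each phase back, then combine residues by CRT."""
    n, modulus_product = 0, 1
    for i, m in enumerate(MODULI):
        s, c = vec[2 * i], vec[2 * i + 1]
        angle = math.atan2(s, c) % (2.0 * math.pi)
        r = round(angle / (2.0 * math.pi) * m) % m
        # Incremental CRT step: adjust n so that it is also congruent to r mod m.
        t = ((r - n) * pow(modulus_product, -1, m)) % m
        n += modulus_product * t
        modulus_product *= m
    return int_to_token(n)


vector = encode("harmonic")
print(len(vector), decode(vector))
```

With these sixteen moduli the vector has 32 dimensions and tokens of up to 15 UTF-8 bytes round-trip exactly; longer tokens would need more moduli or a different integer mapping.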
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Generative Adversarial Networks and Image Synthesis
