
TL;DR
This paper shows that a spiking Sparse Distributed Memory sequence machine and the transformer implement the same five functional operations, formalizes their relationship as a Phase-Latency Isomorphism, and evaluates positional encoding methods on a positionally demanding copy task.
Contribution
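It identifies five functional operations common to both models (encoding, context maintenance, associative retrieval, storage, and decoding), formalizes the linear relationship between sinusoidal positional phase and spike timing, and compares positional encoding strategies empirically.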
Findings
Cosine similarity is the shared retrieval primitive in both models.
Frequency-compressed positional encoding fails to converge on a positionally demanding copy task.
Learned rank-based embeddings outperform sinusoidal encoding in sequence tasks.
Abstract
Sequence learning reduces to similarity-based retrieval over a temporally indexed representation space, a constraint on any sequence model, not a property of a specific architecture. We show that a spiking Sparse Distributed Memory sequence machine (2007) and the transformer (2017) independently instantiate the same five functional operations (encoding, context maintenance, associative retrieval, storage, and decoding), with cosine similarity as the shared retrieval primitive in both. We formalise a Phase-Latency Isomorphism showing that sinusoidal positional phase and spike timing are linearly related, and prove that dot product attention is invariant to this mapping up to a global scale factor on the positional component (Lemma 1). Empirically, frequency-compressed positional encoding fails to converge on a positionally demanding copy task, while a learned rank-based embedding matches…
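The retrieval claim in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the toy memory, the query, and the helper names `cosine_similarity`, `retrieve`, and `attention_weights` are all hypothetical, and the keys are assumed to be unit length. Under that assumption, query-key dot products are the cosine similarities scaled by the query norm, so cosine-based retrieval and unscaled softmax attention rank the stored items identically.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query, keys):
    """Index of the stored key most similar to the query."""
    scores = [cosine_similarity(query, k) for k in keys]
    return max(range(len(keys)), key=scores.__getitem__)

def attention_weights(query, keys):
    """Softmax over query-key dot products (unscaled, for illustration)."""
    logits = [sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy memory: three unit-length keys (an assumption of this sketch).
keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query = [0.1, 0.9, 0.2]

best = retrieve(query, keys)               # hard, winner-take-all retrieval
weights = attention_weights(query, keys)   # soft retrieval over the same keys
```

Here `best` is the index chosen by cosine similarity, and the largest entry of `weights` falls on the same index. With keys of unequal norm the two rankings can differ, which is why the unit-length assumption is stated explicitly.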
