Adaptive Two Sided Laplace Transforms: A Learnable, Interpretable, and Scalable Replacement for Self-Attention
Andrew Kiruluta

TL;DR
This paper introduces a learnable, interpretable, and scalable two-sided Laplace transform mechanism as a replacement for self-attention in transformers, enabling efficient processing of ultra-long sequences with competitive performance.
Contribution
It presents a novel learnable two-sided Laplace transform that replaces self-attention, with adaptive node allocation and efficient computation, improving scalability and interpretability.
Findings
Achieves perplexities and task scores comparable to or better than those of existing transformers.
Effectively handles context lengths exceeding 100,000 tokens.
Demonstrates the importance of learnable parameters and adaptive node allocation.
Abstract
We propose an innovative, learnable two-sided short-time Laplace transform (STLT) mechanism to supplant the traditional self-attention in transformer-based LLMs. Our STLT introduces trainable parameters for each Laplace node, enabling end-to-end learning of decay rates, oscillatory frequencies, and window bandwidth T. This flexibility allows the model to dynamically adapt token relevance half-lives and frequency responses during training. By selecting S learnable nodes and leveraging fast recursive convolution, we achieve an effective complexity of O(NS) in time and memory. We further incorporate an efficient FFT-based computation of the relevance matrix and an adaptive node allocation mechanism to dynamically adjust the number of active Laplace nodes. Empirical results on language modeling (WikiText-103, Project Gutenberg), machine translation (WMT'14 En-De), and long document question…
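The recursive convolution mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names are hypothetical, the decay rate `sigma` and frequency `omega` are fixed numbers rather than trained parameters, the input is a scalar sequence rather than token embeddings, and the window bandwidth T is left out entirely. Under those assumptions, one Laplace node with the two-sided kernel exp(-sigma·|n-m|)·exp(-i·omega·(n-m)) can be evaluated with one forward and one backward first-order recursion, and the result checked against the direct quadratic-cost sum.

```python
import cmath
import math


def half_life(sigma):
    """Distance (in tokens) at which the decay exp(-sigma * d) drops to 1/2."""
    return math.log(2.0) / sigma


def two_sided_laplace_recursive(x, sigma, omega):
    """Evaluate one node in O(N) time using two first-order recursions.

    Assumed kernel (illustrative): exp(-sigma*|n-m|) * exp(-1j*omega*(n-m)).
    """
    n = len(x)
    a = cmath.exp(complex(-sigma, -omega))  # causal multiplier (past tokens)
    b = a.conjugate()                       # anti-causal multiplier (future tokens)

    fwd = [0j] * n
    acc = 0j
    for i in range(n):                      # left-to-right pass
        acc = a * acc + x[i]
        fwd[i] = acc

    bwd = [0j] * n
    acc = 0j
    for i in range(n - 1, -1, -1):          # right-to-left pass
        acc = b * acc + x[i]
        bwd[i] = acc

    # x[i] is included in both passes, so subtract it once.
    return [fwd[i] + bwd[i] - x[i] for i in range(n)]


def two_sided_laplace_direct(x, sigma, omega):
    """Reference O(N^2) evaluation of the same kernel, for checking only."""
    n = len(x)
    return [
        sum(
            x[m] * math.exp(-sigma * abs(i - m)) * cmath.exp(-1j * omega * (i - m))
            for m in range(n)
        )
        for i in range(n)
    ]


if __name__ == "__main__":
    signal = [1.0, 0.5, -2.0, 3.0, 0.0, 1.5]
    nodes = [(0.1, 0.0), (0.5, 1.3)]        # hypothetical (sigma, omega) pairs
    for sigma, omega in nodes:
        fast = two_sided_laplace_recursive(signal, sigma, omega)
        slow = two_sided_laplace_direct(signal, sigma, omega)
        err = max(abs(f - s) for f, s in zip(fast, slow))
        print(f"sigma={sigma}, omega={omega}, half-life={half_life(sigma):.2f}, max error={err:.2e}")
```

In this sketch each node costs two passes over the sequence, so S nodes cost time proportional to N times S, and a larger `sigma` corresponds to a shorter half-life for token relevance.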
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications
